yar 1957 


SETS, LOGIC, AND MATHEMATICAL FOUNDATIONS

by

Stephen C. Kleene
University of Wisconsin

A Summer Institute for Teachers
of Secondary and College Mathematics

Sponsored by
The National Science Foundation

Notes by H. William Oliver

Williams College
Williamstown, Massachusetts

1956


Sets, Logic, and Mathematical Foundations: Corrections and Emendations 


Renumber page 21 as page 20, and page 21° as page 21.


tLp, 4 £4, open parentheses before second "x" and close after 
second "B". Similar correction: p. 4 4 10, 
p. 21 (corrected numbering) £. 3b (= line 3 from 
below), p. 22. &. 10, p. 23(1), p. 24 4%. 11b - 10b 
twice. 
} 


Close parentheses: p. 23 end &. 15, p. 26 end &. Dy, pe By 
: formula 26 before "<—>", p. 64 end Sec. 8, p. 75 end 
g. lb, p. 77 end &. 13, p. 79 end J. 7(?), p. 82 end 
g. lb (2), p. 87 & 10d after "P(y)". 


~1 2b (of text), after "belongs" insert "to M". 

| hi 9 change "N." to "N, in which case". 

=) 1S) op and change "g" to "g" (it is not Greek "phi"). 

ae: 36 5 change ", meaningful for" to "of". 

zn 16) 7b delete "actually arise", and before "is" insert 

"are correlated to some member of A". 

- 7 8 for "consists" read "is the set". 

— 7 10 before "whose" insert "in the set". 

= 7 15 delete "case". 

son 23 for "<" read "+". 

—- 10 11 after "1-5" insert "and 1'-5'", 


10 9b delete first "such". 


11, to Footnote 2 add "See p. 210 for a reference to Stone's
1936 theorem that every Boolean algebra can be
interpreted as the algebra of certain subsets
of a suitable set U."


-12 9 before "=s' " insert "=1' ANA". 
- 13 title should read "Sets". 
-1 10b for "the" read "a". 
oS 9b delete "the". 
- 14 8-9 for "(Several" read "(Many". 
= 16 12b for "attain" read "obtain". 
= 16 6b, for "-p/q" read “written p/q in lowest terms 
with positive denominator". 
~ 17 15 for "yn" read "y" (at is not Greek "eta"). 
= 19 lb preferably for "the" read "a". 
Cor- (20 2 make the fourth "."a "3" (7). 
| nected yan 11b for "in" read "is". 
| ee 2 o 12b after "case" insert "that". 


ing 


— 68 


= 66 


— 75 
— 73 


Lee 


Line 
6b for "it is" read "is it". 
7b after "+" insert "(as a binary operation, the +1 
of Peano's axioms being a unary operation)". 
1b the last word is "can". 
1 after "But" insert "B'= n with". 
14b for "halfs" read "halves". 
8 for "derive" read "have". 
1 for second "of" read "in". 
4 delete final comma. 
8b of text for "asuumptions" read "assumptions". 
8b insert comma before "which". 
3 before "set" insert "non-empty". 
7 for "in" read "of". 
3b for "variables" read "letters" twice. Similarly: 
p. 49, @ 5, p. 59, £. 15. 
9 before "for" insert "simultaneously". 
Table VII is for eee & (Pi—> Ps)" (1.e., correct 
third subscript). 
4b for "constructed" read "construed". 
11b for "interchange" read "complete the interchange
of".
Y for "that" read "whether". 
6b for "axioms" read "formulas". 
8b before "A" insert "make". 
5b for "if | A —> B" read "if A [| B". 
leh “the last word is "any". 
last line of Sec. 8, after "Thence" insert "with 3", 
5b for "form" read "respective forms". 
12 delete "A —> A". 
2b for "propositon" read "proposition". 
10 after "two" insert "(right column)". 
formula 4 should be "B —> A & B". 


to Footnote 7 add "(For Eurasia: Noordhoff, Groningen, and 


5 
8 


14 
16 


North Holland Publishing Co., Amsterdam.) 
after "If" insert "[-". 


after "shall" insert "now"; &. 9 for "suggests 
that they can be" read "is the proper term for 
the case that they are"; and g. 11 for "that" 
read "applications in which". 


for "prime predicates" read "predicate letters”. 
quotation marks at end of line. 


CT Ollleeeeeeeeeeeeeeeeeoeoe———————E———————— 


Page 


eae 
— 76 
— 78 
— 80 
— 61 
—.82 
= 85 
= 86 


Line 
2 
3b 
2b 


15 
10b 


ca 


for "out" read "our". 


for "matter" read "matters". 

after "formula" insert "A". 

for "P(y)" read "P(x)". 

for first "P(y)" read "P(x)". 

for "oceurrence" read "occurrences". 

the lowest subscript should be "i" instead of "n". 
for "exactly" read "essentially". 

better read "the" for "a". 


second line of Formula 51, for "V(x)" read "Vx". 


5 
11b 
4-5 


10 


after "by" read "the". 

for "Theorem 3" read "Theorem 2". 

from ", whence" to the period is redundant here. 
for second comma substitute a semicolon. 

insert " at the beginning. 

for "propositional" insert "predicate". 

for "redundant" read "unnecessary". 

before "Theorem 4" insert "Theorem 3 and". 
before "4," insert "3,". 

delete the quotation marks immediately after WL. 
for "Theory" read "Theorem". 


replace "(the natural numbers)" by "D'"; then on
the rest of the page replace "D" by "D'".  Replace
"D'" by "D''" at p. 101 ℓ. 2b, p. 102 ℓ. 3,
and p. 102 ℓ. 13; and at p. 102 ℓ. 4 replace "it"
by "D'' (say the natural numbers)".


for "this" read "there". 


are" 


in Footnote 13 add “Also cf. pp. 425-427.". 


8b 


delete last four symbols. 


after "for" insert "the predicate calculus, case 
of". 
rae ta 


before "for" insert comma. 
text, for "11" read "24" twice. 
for "properties" read "meanings". 


delete "unspecified or". 


" " 


before "=" insert "c" 
for "precisely" read "previously". 
for "in" read "of". 


delete the first comma. 


Page Line ¥ 
—120 ‘s) delete the extra period before "4." 
— 125 1 after "function" insert "'". 
— 125 12b before "Se" insert "0". 
~126 in third line of footnote for "26" read "18", 
— 127 6b for "in terms" read " + instead". 
isl 8 preferably read "of" instead of "for". 
| — 134 7b for "regar" read "re-" 
— 137 9b for "a sequence" read "sequences". 
~140 in title line 1 at left, for "FOR" read "BY", 
—141 8 for "puts" read "prints". 
—148 close parentheses at end of footnote. 
— 150 (ec) for "B" read "B," 
152 to the footnote add "Also see the article by Kemeny, J., 


Man viewed as a Machine, ibid., 

| April, 1955, pp. 58 ff. The present treatment 
will be used in an article Mathematics, Founda- 
| tions of, in Encyclopedia Britannica, LOST 


printing. 
153 2 insert a comma after "that". 
153 6 insert a comma at the end. 
154 13b for "(4).B," read "(ii), ~B,"- 
154 12b before "By" insert "".
Aves) 3 for "3," read "3;". 
| 156 6 after "system" insert "S". 
157 is delete the comma. 
158 8 for "of" read "in". 
158 4 from bottom of text, insert comma before "expresses". 
160 ast line of footnote, for "26927" read 52". 
163 8b change the last comma to "holds". 
165 11 for "system" read "theorem". 
167 11 for the first "by" read ", upon". 
167 5b insert comma before "under". 
168 15 change second "I," to "TI", 


168 footnote, delete "s" and ", and 40, p. 401". 
169 first footnote number should be "*°". 


| 169 3b of text, for "reasonings ... are" 


read "reasoning 
eK ees 


| 
ee ee 


Chapter I 
SETS 

1. Sets and elements

The concept of "set" is a very old one.  The theory of
sets, as a mathematical discipline, however, dates from the work
of Georg Cantor (principal publications on this, 1874-1897).
Cantor's work was a great innovation in that it dealt with
infinite as well as finite sets.  Prior to his time, there was a
definite prejudice on the part of most mathematicians against
the very concept of "infinite" as an actuality.  This prejudice
is exemplified by the opinion of Gauss, one of the most
influential of nineteenth-century mathematicians.  "I protest ...
against using infinite magnitude as something consummated; such
a use is never admissible in mathematics.  The infinite is only
a façon de parler: one has in mind limits which certain ratios
approach as closely as is desirable, while other ratios may
increase indefinitely."

Cantor proposed the following as a definition of a set:
"By a 'set' we understand any collection M of definite,
well-distinguished objects m of our perception or our thought (which
are called the 'elements' of M) into a whole."  The key words
here are "collection," which emphasizes the unity of the group
of objects; "well-distinguished," indicating that no merging
among the objects exists; and "definite."  For a definite object,
it is determined whether or not the object belongs to M.  We
shall reconsider this definition later.¹


¹There is a full and illuminating discussion in Fraenkel, A.A.,
"Abstract Set Theory," North Holland Publishing Co., Amsterdam,
1953, pp. 6-18.  We commend this book to you as a very careful
and full exposition of the theory of abstract sets, written
(according to the author's preface) "for undergraduates in
mathematics, graduate students in philosophy, and high school
teachers."  The author writes with a great affection for his subject.


We shall use the words "element" and "member" synonymously,
and shall write:

     m ∈ M    for    m is a member of M

     m ∉ M    for    m is not a member of M.


We give now some fundamental definitions of set theory.

     M = N  if def.  each member of M is a member of N
                     and vice versa.

     M is a subset of N  if def.  each member of M
                                  is a member of N, in which case

     M is the improper subset of N  if  M = N,
     M is a proper subset of N      if  M ≠ N.


We shall write:

     M ⊂ N    for    M is a subset of N.

The definition of M ⊂ N does not preclude the possibility
that M = N.  In fact, we always have M ⊂ M.  (Some authors write
M ⊆ N for M is a subset of N, and M ⊂ N for M is a proper subset
of N.)

The definition of equality of sets may be restated in
terms of "subset" as follows:

     M = N  if def.  (M ⊂ N and N ⊂ M)
As a notation for a set whose members are explicitly
given, we shall use the following:

     {a}      is the set (called a unit set) whose only
              element is a.
     {a,b}    is the set whose elements are a and b.
     {a,b,c}  is the set whose elements are a, b, and c.
     and so forth.
We now consider a method of picking out subsets from a
given set A.  Suppose given a set A and a property P(x) meaningful
for members x of A (i.e., such that P(x) is true or P(x) is
false, for each member x of A).  Then, as a subset of A, we have
the set of the x's such that x ∈ A and P(x).  We denote this
subset by

     {x ∈ A | P(x)}
or   x̂ (x ∈ A and P(x)).

As an example, we have:

     B = {x ∈ I | x is even}.

Here I denotes the set of integers, so that B is the set of even
integers.
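
The subset notation {x ∈ A | P(x)} has a close analogue in the set comprehensions of many programming languages. The following is a minimal sketch in Python, not part of the lectures; the helper name `subset_where` is hypothetical, and a finite range stands in for the integers, since a program cannot hold an infinite set.

```python
def subset_where(A, P):
    """Return {x in A | P(x)} for a finite set A and a property P."""
    return {x for x in A if P(x)}


# Assumption: the integers from -5 to 5 stand in for the set of all integers.
A = set(range(-5, 6))

# B = {x in A | x is even}
B = subset_where(A, lambda x: x % 2 == 0)

print(sorted(B))
```

A property that no member satisfies yields a set with no members, which is the situation discussed next in the text.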

A more complex and also more illuminating example is the
following:

     B = {n ∈ I | x^n + y^n = z^n for some positive integers
                  x, y, z, and n > 2}.

This set B is related to the famous "Last Theorem" of Fermat:
x^n + y^n = z^n has no solutions in positive integers x, y, z if
n > 2.  B is the set of n's for which such solutions do exist.  If
Fermat's conjecture is correct, there are no elements in B.
Since the truth of Fermat's conjecture has never been established,
we do not know whether or not B has any members.  Nevertheless,
we wish to be able to talk about such sets as B, and thus we must
admit as a set the ∅ of the following definition:

     ∅, the empty set  = def.  the set which has no members.

Here we say the empty set, since clearly any two sets with no
members are equal, by a straightforward application of the
definition of equality (i.e. identity) of sets.


A ∪ B, the union of A and B  = def.  the set which has as elements
the elements of A and the elements of B.  The union is also
called join or sum.

     Example:  {a,b,c} ∪ {a,c,d} = {a,b,c,d}

The definition of A ∪ B can be restated by giving the
condition that any object x belong to A ∪ B, thus:

     x ∈ A ∪ B  <-->  (x ∈ A or x ∈ B)

(In these lectures, and in mathematics generally, "or" will be
the non-exclusive disjunction, more explicitly "and/or.")

A ∩ B, the intersection of A and B  = def.  the set which
has as elements the objects belonging to both A and B.  The
intersection is also called meet or (inner) product.

     x ∈ A ∩ B  <-->  (x ∈ A and x ∈ B)

We extend these definitions to include unions and
intersections of more than two sets:

Let S be a set whose members are sets X.

     ⋃_(X∈S) X = ⋃S = the union of the members of S  = def.
the set which has an object as a member if and only if the
object belongs to some member of S.

     ⋂_(X∈S) X = ⋂S = the intersection of the members of S  = def.
the set which has an object as a member if and only if the
object belongs to every member of S.
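
The union and intersection of the members of a set S of sets can be sketched in Python as follows. This is an illustration only: the names `big_union` and `big_intersection` are hypothetical, S is given as a list because Python's mutable sets cannot themselves be members of a set, and the intersection of an empty S is rejected here as a design choice of the sketch (the text does not discuss that case).

```python
def big_union(S):
    """The set of objects belonging to SOME member of S."""
    result = set()
    for X in S:
        result |= X
    return result


def big_intersection(S):
    """The set of objects belonging to EVERY member of S (S non-empty)."""
    members = list(S)
    if not members:
        raise ValueError("this sketch requires a non-empty S")
    result = set(members[0])
    for X in members[1:]:
        result &= X
    return result


S = [{1, 2, 3}, {2, 3, 4}, {3, 4, 5}]
print(big_union(S), big_intersection(S))
```

With only two members, these reduce to the binary operations: `big_union([A, B])` is `A | B` and `big_intersection([A, B])` is `A & B`.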

The set {a,b} corresponds to the "combination" of a and
b, whereas the ordered pair (a,b) (first a, second b, as in
coordinates in analytic geometry) corresponds to the
"permutation" of a and b.  {a,b} = {b,a}, i.e., these two sets are
equal; but the ordered pairs (a,b) and (b,a) are not the same,
except in the special case that a = b.

As a simple illustration of how mathematical concepts can
be defined in terms of the concepts of the theory of sets, we
can define the ordered pair or "permutation" of a and b thus:

     (a,b)  = def.  {{a,b}, {a}}

One may then show that (a,b), so defined, has the following
properties which are what we wish for ordered pairs:

     1.  Given any objects a and b in order,
         there is an object (a,b) uniquely
         determined by a and b.

     2.  (a₁,b₁) = (a₂,b₂)  -->  a₁ = a₂ and b₁ = b₂.

(In the proof, after assuming {{a₁,b₁}, {a₁}} = {{a₂,b₂}, {a₂}},
we deduce a₁ = a₂ and b₁ = b₂ by use of the definition of
equality for sets.  Consider cases, according as a₁ ≠ b₁ and
a₂ ≠ b₂; a₁ = b₁ and a₂ = b₂; a₁ = b₁ and a₂ ≠ b₂; a₁ ≠ b₁
and a₂ = b₂.  The third and fourth cases are dispensed with by
showing that they cannot actually arise under the assumption.)

N. Wiener (1914) was the first to give a definition of 
ordered pair in terms of sets; the present definition is due to 
Kuratowski. 
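
The set-theoretic definition of the ordered pair, (a,b) = {{a,b}, {a}}, can be tried out directly. The sketch below is an illustration and not part of the lectures: the name `kpair` is hypothetical, and Python's `frozenset` is used because a set that is to be a member of another set must be immutable. Property 2 is checked by brute force over a small domain, which illustrates the property but is of course not a proof.

```python
from itertools import product


def kpair(a, b):
    """The ordered pair (a, b) represented as the set {{a, b}, {a}}."""
    return frozenset({frozenset({a, b}), frozenset({a})})


def property_2_holds(domain):
    """Check (a1,b1) = (a2,b2) <--> a1 = a2 and b1 = b2 over a finite domain."""
    return all(
        (kpair(a1, b1) == kpair(a2, b2)) == (a1 == a2 and b1 == b2)
        for a1, b1, a2, b2 in product(domain, repeat=4)
    )


print(kpair(1, 2) == kpair(2, 1))          # the pairs differ when a != b
print(frozenset({1, 2}) == frozenset({2, 1}))  # the sets {a,b} and {b,a} agree
print(property_2_holds(range(4)))
```

Note that when a = b the representation collapses to {{a}}, which is the situation handled by the second case in the proof sketch.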

We can extend the idea of ordered set to include ordered 
triples, ordered quadruples, etc. 


     (a,b,c)  = def.  ((a,b),c)


i.e., an ordered triple is an ordered pair whose first element is 
an ordered pair and whose second element is an object. Similarly, 
an ordered quadruple is an ordered pair whose first element is an 
ordered triple, and so forth. 

The concept of ordered pair gives rise to the notion of the
Cartesian product (or outer product) A×B of two sets A and B:

     A×B  = def.  the set of ordered pairs (a,b), where a ∈ A
                  and b ∈ B.

We can further define A×B×C as the set of ordered triples
(a,b,c) with a ∈ A, b ∈ B, c ∈ C; or, equivalently,
A×B×C = (A×B)×C.
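
As a hedged illustration of the Cartesian product, the following Python sketch builds A×B with the standard-library `itertools.product` and builds A×B×C from triples represented as ((a,b),c), following the definition of ordered triple given above. The function names are hypothetical.

```python
from itertools import product


def cartesian(A, B):
    """A x B: the set of ordered pairs (a, b) with a in A and b in B."""
    return set(product(A, B))


def cartesian3(A, B, C):
    """A x B x C, with each triple (a, b, c) represented as ((a, b), c)."""
    return {((a, b), c) for a in A for b in B for c in C}


A, B, C = {1, 2}, {"x", "y", "z"}, {True, False}
print(len(cartesian(A, B)))
print(cartesian3(A, B, C) == cartesian(cartesian(A, B), C))
```

Under this representation the equation A×B×C = (A×B)×C holds literally, since a triple is by definition a pair whose first element is a pair.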




We consider another operation on sets, which bears a slight
resemblance to subtraction:

A − B, the difference of A and B  = def.  the set whose members are
those objects which belong to A but do not belong to B.

If we are given a property P(x) of members
x of a set A, we have constructed, in set theory, an object (a
subset of A) which represents this property, namely, {x ∈ A | P(x)}.

We shall also want to consider relations between two (or
more) objects.  If x is in the relation R to y, where x ∈ A
and y ∈ B, we shall write R(x,y) or x R y.  This relation may be
represented in set theory by the set whose members are the ordered
pairs (x,y) such that x R y.  This set is a subset
of A×B.  Of course, A and B may be the same or different.

For example, the familiar relation x < y, where x and y
are integers, is represented by the set:

     {(1,2), (1,3), (2,3), ...}.

(3,1), for instance, is not a member of this set, whereas (7,15)
is.

To a special type of relation we give the name function:
A function y = f(x) from A to B is a correspondence by which to
each object x ∈ A there is correlated an object f(x) ∈ B.  A is
called the domain of the function.  The subset of B consisting of
those objects of B which are correlated to some member of A is
called the range of the function.  Of course, A and B may be the
same or different sets.

We can represent a function y = f(x) by the set of the
ordered pairs (x,y) where x ∈ A and y = f(x).  Thus it becomes
a special kind of relation under the above representation of
relations.  It is distinguished among relations by the condition
that, for each x ∈ A, there is one and only one ordered pair in
the relation whose first element is x.

     Example:  y = x^2 + 2x + 5.

Then (1,8) belongs to this function, since 8 = 1^2 + 2·1 + 5, but
(2,7) does not belong.
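
The representation of relations and functions as sets of ordered pairs can be sketched as follows. This is an illustration under stated assumptions: the helper names are hypothetical, and a small finite range stands in for the set of integers.

```python
def graph(f, A):
    """The set of ordered pairs (x, f(x)) for x in the finite domain A."""
    return {(x, f(x)) for x in A}


def is_function_on(R, A):
    """True if, for each x in A, exactly one pair in R has first element x."""
    return all(sum(1 for (x, _) in R if x == a) == 1 for a in A)


def range_of(R):
    """The objects that are correlated to some member of the domain."""
    return {y for (_, y) in R}


A = set(range(-3, 4))                      # finite stand-in for the integers
F = graph(lambda x: x**2 + 2*x + 5, A)     # the example y = x^2 + 2x + 5
LESS = {(x, y) for x in A for y in A if x < y}   # the relation x < y on A

print((1, 8) in F, (2, 7) in F)
print(is_function_on(F, A), is_function_on(LESS, A))
```

The relation x < y fails the test for a function because, for example, 0 is the first element of several pairs, while the largest member of A is the first element of none.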


A function of more than one variable is defined similarly:
Given sets A₁, A₂, ..., Aₙ, and B, the function y = f(x₁, ..., xₙ),
where x₁ ∈ A₁, ..., xₙ ∈ Aₙ, y ∈ B,
is the set of (n+1)-tuples (x₁, ..., xₙ, y) such that, for each way
of choosing x₁ ∈ A₁, ..., xₙ ∈ Aₙ, there is one and only one
(n+1)-tuple in the set whose first n elements are x₁, ..., xₙ in
that order.

     Example:  y = x₁ + x₂

(1,-1,0) belongs; (1,2,1) does not.

When considering such expressions as "x < y," it is
necessary to make clear whether we intend the proposition x < y (which
is ambiguous unless x and y have been specified), or we are
talking about the relation x < y.

For example, when we say "when x < y, the difference y - x
is positive," we are using "x < y" in the first sense; and when we
say "x < y is transitive," we are using "x < y" in the second
sense.  To distinguish the latter case, we may use the notation
introduced by Church (in 1932):  λxy x < y.  Similarly in the
case of a ternary relation:  λx₁x₂y y = x₁ + x₂.  Similarly
"λxy x + y" denotes the function x + y as distinguished
from its so-called ambiguous value; λx x + y denotes x + y as
function of x with y as parameter.
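
The distinction drawn by the λ-notation survives in the `lambda` expressions of many programming languages. The sketch below is only an analogy, not something the lectures propose: Python's `lambda x, y: x + y` plays the role of λxy x + y, and the names `add`, `add_to`, `less` and `is_transitive` are hypothetical.

```python
# λxy x + y : the function itself, as opposed to its ambiguous value x + y.
# (Binding a lambda to a name is done here only to mirror the notation.)
add = lambda x, y: x + y


def add_to(y):
    """λx x + y : a function of x alone, with y held fixed as a parameter."""
    return lambda x: x + y


# λxy x < y : the relation, viewed as a two-place function into truth values.
less = lambda x, y: x < y


def is_transitive(rel, domain):
    """Check transitivity of a relation over a finite domain."""
    return all(
        rel(a, c)
        for a in domain for b in domain for c in domain
        if rel(a, b) and rel(b, c)
    )


print(add(2, 3), add_to(10)(1), is_transitive(less, range(5)))
```

Saying "x < y is transitive" is a statement about the object `less`, whereas "when x < y, the difference y - x is positive" concerns the value `less(x, y)` for particular x and y.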

We conclude this section with the following definitions:

A and B are disjoint (or mutually exclusive)  if def.  they have no
common members

or, in symbols:  A and B are disjoint  <-->  A ∩ B = ∅.

A set S of sets is disjointed (or the sets of S are mutually
exclusive)  if def.  each pair of the sets belonging to S is
disjoint.


2. Subsets of a given set


Consider a fixed set U, called the universal set.  All
sets A, B, C, ... in this section are to be subsets of U.  We
define:  ~A, the complement of A  = def.  U − A.  Then, according
to our definitions, each of A ∪ B, A ∩ B, ∅, U, and ~A is a
subset of U, provided that A and B are.

There is a significant algebra of these subsets, which we
shall partially develop by listing the following formulas, which
are true for any subsets A, B, C of U.

 1.  A ∪ B = B ∪ A                       1'.  A ∩ B = B ∩ A
 2.  (A ∪ B) ∪ C = A ∪ (B ∪ C)           2'.  (A ∩ B) ∩ C = A ∩ (B ∩ C)
 3.  A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)     3'.  A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)
 4.  A ∪ ∅ = A                           4'.  A ∩ U = A
 5.  A ∪ ~A = U                          5'.  A ∩ ~A = ∅
 6.  If, for every A, A ∪ B = A,         6'.  If, for every A, A ∩ B = A,
     then B = ∅.                              then B = U.
              7,7'.  If A ∪ B = U and A ∩ B = ∅,
                     then B = ~A.
              8,8'.  ~~A = A
 9.  ~∅ = U                              9'.  ~U = ∅
10.  A ∪ A = A                          10'.  A ∩ A = A
11.  A ∪ U = U                          11'.  A ∩ ∅ = ∅
12.  A ∪ (A ∩ B) = A                    12'.  A ∩ (A ∪ B) = A
13.  ~(A ∪ B) = ~A ∩ ~B                 13'.  ~(A ∩ B) = ~A ∪ ~B
14.  A ∪ B = ~(~A ∩ ~B)                 14'.  A ∩ B = ~(~A ∪ ~B)


To prove these formulas, we shall make use of a schematic
representation of sets called a Venn diagram.  The universal set U
is represented by the points inside a square, and a subset A of U
by the interior of a circle within the square.  Then ~A will be
the part of the square outside the circle.  In considering a
formula, such as 1, which deals with two subsets A and B, there are
exactly four possibilities for any point x ∈ U:  (1) x ∈ A and
x ∈ B, (2) x ∈ A and x ∉ B, (3) x ∉ A and x ∈ B, (4) x ∉ A and
x ∉ B.  These four possibilities are schematically represented by
the correspondingly marked portions of the Venn diagram.  (See
Fig. 1.)


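[Fig. 1: Venn diagram for two subsets A and B of U.  Fig. 2: Venn
diagram for three subsets A, B, C of U, with numbered regions.]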


Thus the Venn diagram gives schematically an actual proof (which
could be given also by treating the four cases verbally), and is
not merely an illustration.

We illustrate this method of proof by proving formula 3:
A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C).  The Venn diagram is given in Fig. 2.
A is (represented by) 1, 2, 4 and 5; B ∩ C by 5 and 6.  Thus
A ∪ (B ∩ C) is 1, 2, 4, 5 and 6.
A ∪ B is 1, 2, 3, 4, 5 and 6; A ∪ C is 1, 2, 4, 5, 6 and 7.  Thus
(A ∪ B) ∩ (A ∪ C) is 1, 2, 4, 5 and 6, the same as A ∪ (B ∩ C).
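
The remark that the Venn diagram amounts to treating every case of membership can be made concrete. For a formula in three sets there are eight possibilities for a point x, according to whether it belongs to each of A, B and C. The Python sketch below (hypothetical names, offered only as an illustration) checks formula 3 in all eight cases, with the truth values a, b, c standing for "x ∈ A", "x ∈ B", "x ∈ C".

```python
from itertools import product


def holds_in_all_cases(lhs, rhs, n):
    """Compare two membership conditions in every one of the 2**n cases."""
    return all(
        lhs(*case) == rhs(*case)
        for case in product([False, True], repeat=n)
    )


# Formula 3:  A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)
formula_3 = holds_in_all_cases(
    lambda a, b, c: a or (b and c),
    lambda a, b, c: (a or b) and (a or c),
    3,
)

# A deliberately wrong variant, (A ∪ B) ∩ (B ∪ C), fails in some case.
wrong_variant = holds_in_all_cases(
    lambda a, b, c: a or (b and c),
    lambda a, b, c: (a or b) and (b or c),
    3,
)

print(formula_3, wrong_variant)
```

The same helper with n = 2 handles formulas such as 1, where only the four cases of Fig. 1 arise.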



The remaining formulas 1-5 and 1'-5' may be proved in a
similar (and in most cases easier) manner.  The formulas 6-14
and 6'-14' may also be proved by this method, but it is an
important fact that these later formulas are deducible from
1-5 and 1'-5'.  In fact, we could have introduced this algebraic
system abstractly (and not as the algebra of the subsets of a
given set) in the following way:

Let S be a set with elements A, B, ..., with two
distinguished elements ∅ and U, which is closed under two binary
operations ∪ and ∩, and closed under a unary operation ~, and
which satisfies formulas 1-5 and 1'-5'.  Such a system is called
a Boolean algebra.

Thus we have proved that the set of all subsets of a given
set U (denoted by 2^U) is a Boolean algebra.

We may also form Boolean algebras by selecting not
all, but some, subsets of a given set, provided that the closure
properties are satisfied and ∅ and U are included.  For this
purpose we may "agglomerate" the set U so that its elements stick
together into "lumps," or in mathematical language we partition
U into non-empty disjoint subsets (of which U is the union).  Now
consider as members of S only subsets A of U chosen in such
a way that the elements of each "lump" either all belong to A or
all do not belong to A.  To give a specific example, suppose U
is the integers.  Besides the Boolean algebra in which S is the
set of all sets of integers, there is also the one in which S is
the set of sets A of integers such that any two integers which
differ by a multiple of 3 always either both belong to A or both
do not belong to A.
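
The "lumps" construction can be tried on a finite stand-in for the integers. In the sketch below, which is an illustration with hypothetical names, U is taken to be {0, 1, ..., 8} (an assumption, since the integers themselves cannot be stored), the lumps are the three classes of numbers that differ by multiples of 3, and S consists of all unions of lumps. The closure properties are then checked directly.

```python
from itertools import combinations

# Assumption: a finite universal set standing in for the integers.
U = frozenset(range(9))

# The "lumps": numbers that differ by a multiple of 3 stick together.
lumps = [frozenset(x for x in U if x % 3 == r) for r in range(3)]


def unions_of_lumps(lumps):
    """All sets A such that each lump lies wholly inside or wholly outside A."""
    members = set()
    for k in range(len(lumps) + 1):
        for chosen in combinations(lumps, k):
            members.add(frozenset().union(*chosen))
    return members


def is_closed(S, U):
    """Closure of S under union, intersection and complement relative to U."""
    return all(
        (A | B) in S and (A & B) in S and (U - A) in S
        for A in S for B in S
    )


S = unions_of_lumps(lumps)
print(len(S), is_closed(S, U))
```

With three lumps there are 2^3 = 8 members of S, including ∅ and U; a set such as {0}, which splits a lump, is not among them.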


In any Boolean algebra, the principle of duality holds.
This principle is emphasized in the numbering of our formulas:
each primed formula is the dual of the corresponding unprimed
formula.  The dual of a formula is obtained by replacing ∪ by ∩,
∩ by ∪, ∅ by U and U by ∅ throughout.  Since the closure
properties and the formulas 1-5, 1'-5' include with each its dual,
it follows that if any theorem (using only the notions S, ∪, ∩,
∅, U, ~) is provable in the algebra, so is its dual.  Thus, in
order to prove 6, 9-14, 6', 9'-14', it suffices to prove only
the unprimed formulas, the others following by duality.  Certain
formulas, such as 7,7' and 8,8', are left unchanged by the
operation which converts them into the dual formula; such formulas
are called self-dual.

To illustrate the deduction of 6-14, 6'-14' from the
closure properties and 1-5, 1'-5' (instead of by the Venn
diagram), we give several.²

Proof of 6.  Assume that (a) A ∪ B = A for every A.  Then
we infer B = ∅ as follows:  (The subscripts on =, written here in
brackets, show what is applied at each step.)

     B =[4] B ∪ ∅ =[1] ∅ ∪ B =[a] ∅.

Proof of 7,7'.  Assume that (a) A ∪ B = U and
(b) A ∩ B = ∅.  Then

     B =[4] B ∪ ∅ =[5'] B ∪ (A ∩ ~A) =[3] (B ∪ A) ∩ (B ∪ ~A) =[1]
     (A ∪ B) ∩ (B ∪ ~A) =[a] U ∩ (B ∪ ~A) =[1'] (B ∪ ~A) ∩ U =[4'] B ∪ ~A.


²The remaining proofs will be found (making allowance for
difference in notation) in Stabler, E.R., An Introduction to
Mathematical Thought, Addison-Wesley Publishing Co., Cambridge, Mass.,
1953, pp. 197-200.  See p. 210 for a reference to Stone's 1936
theorem that every Boolean algebra can be interpreted as the
algebra of certain subsets of a suitable set U.


Similarly,

     ~A =[4] ~A ∪ ∅ =[b] ~A ∪ (A ∩ B) =[3] (~A ∪ A) ∩ (~A ∪ B) =[1]
     (A ∪ ~A) ∩ (~A ∪ B) =[5] U ∩ (~A ∪ B) =[1'] (~A ∪ B) ∩ U =[4'] ~A ∪ B.

But

     B ∪ ~A =[1] ~A ∪ B.

Combining these three results,

     B = B ∪ ~A = ~A ∪ B = ~A.

Proof of 8,8'.  We have:

     ~A ∪ A =[1] A ∪ ~A =[5] U,  and  ~A ∩ A =[1'] A ∩ ~A =[5'] ∅.

So, applying 7,7',
with ~A and A as the "A" and "B" respectively,

     A = ~~A.
In the present section, where A, B ⊂ U, the subtraction of
sets can be expressed in terms of ∩ and ~, thus:

     A − B = A ∩ (~B).

Of course the duality transformation should not be applied
directly to a theorem which contains −; the − should first be
replaced by its equivalent given by this formula.

As one further example of expressing concepts in the terms
taken as primitive in Boolean algebra, we have

     A ⊂ B  <-->  A ∪ B = B  <-->  A ∩ B = A.
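
Both of the last two formulas can be confirmed by brute force over all subsets of a small universal set. The following Python sketch is an illustration only (a check over one finite U, not a proof), and its function names are hypothetical.

```python
from itertools import chain, combinations


def powerset(U):
    """All subsets of the finite set U, as frozensets."""
    items = list(U)
    return [
        frozenset(c)
        for c in chain.from_iterable(
            combinations(items, k) for k in range(len(items) + 1)
        )
    ]


def check_identities(U):
    """Check A - B = A ∩ ~B and A ⊂ B <--> A ∪ B = B <--> A ∩ B = A."""
    U = frozenset(U)
    subsets = powerset(U)
    for A in subsets:
        for B in subsets:
            if A - B != A & (U - B):          # U - B plays the role of ~B
                return False
            is_subset = A <= B
            if is_subset != (A | B == B) or is_subset != (A & B == A):
                return False
    return True


print(check_identities({1, 2, 3}))
```

Here `A <= B` is Python's subset test, which like ⊂ in these notes allows A = B.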


3. Countable and uncountable sets 


The characteristic discoveries of Cantor in set theory
involve the existence of an actual infinite, and not just a
quantity increasing beyond all bounds.  An infinite set is one which
is not finite.  The definition of a finite set depends upon the
set of positive integers: 1, 2, 3, 4, ... (which we assume as
known) and the idea of 1-1 correspondence.

A finite set is one which can be put into 1-1 correspondence
with an initial segment of the set of positive integers.
E.g., a set with three elements is finite, since it corresponds
to the initial segment 1, 2, 3; the empty set, since it
corresponds to the empty initial segment.

The idea of 1-1 correspondence is fundamental in this
definition, and is more basic than the idea of cardinal number.  To
illustrate the notion and also the point of view which makes this
notion basic, consider the following (not too impractical)
example:  In an aboriginal tribe which cannot count beyond twenty
(such tribes exist), a chief is to be chosen from two candidates
A and B by awarding the position to the applicant with the larger
herd of cattle.  The two herds are run through a gate, with a pair
of animals, one from each herd, passing together through the gate.
If A's herd is exhausted before B's, B is chief; and vice versa,
A wins if he has cows left when all of B's are gone.  If the last
two cows walk through together, a different method of selection
must be used, or a co-dominion established.  Even if each herd has
more than twenty cattle, and thus could not be counted by the
tribe, this method of pairing works.

We define 1-1 correspondence between sets A and B as a
set of ordered pairs (a,b) where a ∈ A and b ∈ B such that
each member of A occurs as the first member a of one and only
one pair (a,b), and each member of B occurs as the second member
b of one and only one pair (a,b).  Equivalently, we could define
1-1 correspondence between A and B as a function which possesses
an inverse, i.e., each element in the range B of the function
comes from exactly one element in the domain A.
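
The definition of 1-1 correspondence as a set of ordered pairs translates into a direct test for finite sets. The sketch below is illustrative, with a hypothetical function name; it counts how often each member of A occurs as a first member and each member of B as a second member.

```python
from collections import Counter


def is_one_to_one_correspondence(pairs, A, B):
    """True if `pairs` is a 1-1 correspondence between finite sets A and B."""
    pairs = set(pairs)
    # Every pair must be drawn from A x B.
    if any(a not in A or b not in B for a, b in pairs):
        return False
    firsts = Counter(a for a, _ in pairs)
    seconds = Counter(b for _, b in pairs)
    # Each member of A occurs exactly once as a first member,
    # and each member of B exactly once as a second member.
    return all(firsts[a] == 1 for a in A) and all(seconds[b] == 1 for b in B)


herd_a = {"a1", "a2", "a3"}
herd_b = {"b1", "b2", "b3"}
through_gate = {("a1", "b1"), ("a2", "b2"), ("a3", "b3")}
print(is_one_to_one_correspondence(through_gate, herd_a, herd_b))
```

If one herd has an animal left over, no pairing passes the test, which is the situation in which one candidate wins.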
Cantor was the first to apply this notion systematically to
infinite sets.  An isolated application of the idea occurs in
Galileo's Paradox of 1638.  Galileo observed that the positive
integers correspond 1-1 to the squares of the positive integers,
i.e., 1 <--> 1, 2 <--> 4, 3 <--> 9, ..., thus exhibiting a 1-1
correspondence between a set and one of its proper subsets.  This
situation appears to contradict Euclid's axiom that the whole is
greater than any of its parts.

A third set which is in 1-1 correspondence with the positive
integers is the set N of natural numbers: 0, 1, 2, 3, ... .  (Many
writers use the term "natural numbers" instead for the
positive integers 1, 2, 3, ..., which leaves available only the
more cumbersome name "non-negative integers" for 0, 1, 2, ... .
Moreover, we feel that the number 0 deserves to be included as
"natural."  For technical reasons, it is more convenient in our
subject to take 0, 1, 2, ... as the basic set of numbers than
1, 2, 3, ... .)


A set A is countably infinite  if def.  there exists a 1-1
correspondence between A and N.  A is countable if it is finite or
countably infinite.  Synonyms for countable are enumerable and
denumerable.  (When it is already clear that a set A is infinite,
it suffices to say "A is countable" to express that A is
countably infinite.)  An enumeration of a set A is a specific 1-1
correspondence of A with N.

Many sets of numbers, important in mathematics, are
enumerable.  The set of all integers I is countable, as the following
enumeration shows:

     N:   0    1    2    3    4    5    6   ...
     I:   0    1   -1    2   -2    3   -3   ...
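
An enumeration of the integers that runs 0, 1, -1, 2, -2, 3, -3, ... can be written as an explicit pair of mutually inverse functions. The following Python sketch is an illustration; the names `nth_integer` and `position_of` are hypothetical.

```python
def nth_integer(k):
    """The integer paired with the natural number k: 0, 1, -1, 2, -2, ..."""
    if k % 2 == 1:
        return (k + 1) // 2      # odd positions carry the positive integers
    return -(k // 2)             # even positions carry 0 and the negatives


def position_of(z):
    """The natural number paired with the integer z (inverse of nth_integer)."""
    return 2 * z - 1 if z > 0 else -2 * z


print([nth_integer(k) for k in range(7)])
```

That each function undoes the other is what makes the pairing a 1-1 correspondence between the natural numbers and the integers.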
More surprisingly, the set of rational numbers is
enumerable.  This situation seems paradoxical on geometrical grounds,
since the rationals are dense on the real line, while the points
with the natural numbers (or the integers) as coordinate are
discrete sets (i.e., each point is isolated).  Nevertheless, 1-1
correspondences do exist between the rationals and the natural
numbers.  We shall exhibit three.

(i)  Any rational number can be expressed as a fraction of
integers with denominator positive.  Of course, this fraction is
not unique; e.g., the two fractions -1/2 and -2/4 represent the
same rational number, but are to be regarded as different
fractions.  We begin by tabulating these fractions (i.e., the
fractions p/q, where p ∈ I and q ∈ N − {0}) according to the
following scheme; in each horizontal row, the numerators are
listed according to our previous enumeration of the integers.

     0/1     1/1  →  -1/1     2/1  →  -2/1   ...
      ↓   ↗       ↙        ↗        ↙
     0/2     1/2     -1/2     2/2     -2/2   ...
          ↙       ↗        ↙
     0/3     1/3     -1/3     2/3     -2/3   ...
      ↓   ↗       ↙
     0/4     1/4     -1/4     2/4     -2/4   ...
     ...


We "enumerate" these fractions by following the arrows,
starting at 0/1, thus:  0/1, 0/2, 1/1, -1/1, 1/2, 0/3, 0/4, 1/3,
-1/2, 2/1, -2/1, 2/2, -1/3, 1/4, ... .  Reducing the fractions to
lowest terms and omitting the denominator when it is 1, we obtain:
0, 0, 1, -1, 1/2, 0, 0, 1/3, -1/2, 2, -2, 1, -1/3, 1/4, ...,
which can be interpreted as a list with repetitions of the
rational numbers.  Omitting the repetitions, we have the rational
numbers enumerated as follows.

     0, 1, -1, 1/2, 1/3, -1/2, 2, -2, -1/3, 1/4, ... .
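
The first enumeration of the rationals can be sketched as a program. The version below is an illustration under an assumption about the path of the arrows: it walks the table of fractions along its diagonals, alternating direction, which reproduces the sequence 0/1, 0/2, 1/1, -1/1, 1/2, 0/3, 0/4, 1/3, ... quoted in the text. All function names are hypothetical, and `fractions.Fraction` from the standard library is used to reduce to lowest terms.

```python
from fractions import Fraction
from itertools import islice


def nth_integer(k):
    """The enumeration 0, 1, -1, 2, -2, ... used for the numerators."""
    return (k + 1) // 2 if k % 2 == 1 else -(k // 2)


def fractions_along_arrows():
    """Yield (numerator, denominator) pairs, diagonal by diagonal."""
    d = 0
    while True:
        # Row r holds denominator r + 1; column c holds numerator nth_integer(c).
        # Odd diagonals run from the left edge up to the top row,
        # even diagonals run from the top row down to the left edge.
        rows = range(d, -1, -1) if d % 2 == 1 else range(0, d + 1)
        for r in rows:
            yield nth_integer(d - r), r + 1
        d += 1


def rationals():
    """The list of rationals with repetitions struck out."""
    seen = set()
    for p, q in fractions_along_arrows():
        x = Fraction(p, q)
        if x not in seen:
            seen.add(x)
            yield x


print([str(x) for x in islice(rationals(), 10)])
```

Every fraction p/q sits at a finite position in the table, so it is reached after finitely many steps, and striking out repetitions leaves each rational number listed exactly once.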

(ii)  We can represent each rational number in the form
(-1)^n p/q, where n, p, and q are natural numbers and q ≠ 0.
This representation is unique if n, p, and q are chosen to be as
small as possible.  E.g.:  -18/60 = (-1)^1 3/10, so n = 1, p = 3,
q = 10.  (n always = 0 or 1.)  To the rational number represented
by (-1)^n p/q we make correspond the natural number 2^n 3^p 5^q,
which we call the index of the rational number.  Notice that 2,
3, and 5 are distinct prime numbers.  Because of the uniqueness
of factorization in the positive integers, different rational
numbers have different indices, i.e., the correspondence between
the rational numbers and their indices is 1-1.  For example,
33,750 = 2^1·3^3·5^4, so it is the index of -3/4 (n = 1, p = 3, q = 4)
and only of -3/4.  (But 28 = 2^2·3^0·5^0·7^1 is not the index of any
rational number for several reasons:  it has 7 as a factor, it
has 2 as a factor too often, and it has 5 as a factor too seldom.)
To obtain the enumeration, we list the rational numbers in the
order of magnitude of their indices.  This method of enumeration
uses an idea of Gödel, who made significant use of it in another
connection, as we shall see later in these lectures.
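
The index 2^n 3^p 5^q and its decoding can be sketched in Python as follows. This is an illustration with hypothetical function names; `fractions.Fraction` supplies the representation in lowest terms with positive denominator, which corresponds to choosing n, p and q as small as possible.

```python
from fractions import Fraction
from math import gcd


def index(r):
    """The index 2^n 3^p 5^q of the rational number r = (-1)^n p/q."""
    r = Fraction(r)
    n = 1 if r < 0 else 0
    p, q = abs(r.numerator), r.denominator
    return 2**n * 3**p * 5**q


def rational_with_index(k):
    """The rational number whose index is k, or None if k is not an index."""
    if k < 1:
        return None
    exponents = []
    for prime in (2, 3, 5):
        e = 0
        while k % prime == 0:
            k //= prime
            e += 1
        exponents.append(e)
    n, p, q = exponents
    if k != 1 or n > 1 or q == 0:
        return None              # a foreign prime factor, too many 2's, or no 5
    if gcd(p, q) != 1 or (p == 0 and n == 1):
        return None              # n, p, q were not chosen as small as possible
    return Fraction((-1) ** n * p, q)


print(index(Fraction(-3, 4)), rational_with_index(33750), rational_with_index(28))
```

Listing the rationals by increasing index then amounts to running through k = 1, 2, 3, ... and keeping those k for which `rational_with_index(k)` is not `None`.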

(iii)  The following method may be called "the method of
digits."  We imagine a typewriter with only 12 keys:  0, 1, 2, 3,
4, 5, 6, 7, 8, 9, -, /.  Any rational number, written in lowest
terms with positive denominator, can be
represented by a succession of these twelve symbols.  We can
reinterpret this succession of symbols as the representation of a
natural number in the duodecimal system, where the "-" represents
"ten" and "/" represents "eleven."  For example:  the rational number
-2/3 (minus two-thirds) is reinterpreted as the natural number
17,703 = 10·12^3 + 2·12^2 + 11·12 + 3.  If we then list the
rationals in order of magnitude of the natural numbers thus
represented, we have an enumeration of them.
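
The "method of digits" is easy to mechanize. In the Python sketch below (illustrative; the names `KEYS` and `duodecimal_value` are hypothetical) each of the twelve keys is assigned its digit value, with "-" as ten and "/" as eleven, and the typed string is read as a numeral in base twelve.

```python
# The twelve keys in order of digit value: "-" is ten and "/" is eleven.
KEYS = "0123456789-/"


def duodecimal_value(text):
    """Read a string typed on the twelve keys as a base-twelve numeral."""
    value = 0
    for ch in text:
        value = 12 * value + KEYS.index(ch)   # raises ValueError on other keys
    return value


print(duodecimal_value("-2/3"))
```

Two different strings of the same length always receive different values; strings of different lengths can collide only through leading zeros, which do not occur when a rational is written in lowest terms in the usual way.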

The methods which we have applied can be used to prove 
the following general statements. 

If A and B are countable sets, so are AUB, A/)B, and 
AXB. (If A and B are countably infinite sets, so are AUB and 
AXB.) If S is a set of sets A, and S is countable, and each 
A € § is countable, then US is countable. 

Thus our proof that the integers are countable shows more 
generally that AUB is countable when A and B are. Our first 
proof that the rational numbers are countable (or, with modifica- 
tions, even the others) gives a proof that AB is countable 
when A and B are (thus, when we enumerated the fractions, A was 
| and B was) - {9}). The proof that when S is a countable set 
of countable sets $\bigcup S$ is countable can be given by writing
enumerations of the members of S in horizontal rows and following the
arrows as in (i) above.
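
As an illustration of the last proof, here is a small Python sketch. It assumes that the arrows run along the successive diagonals of the array (row i, column j with i + j = 0, 1, 2, ...); the function name `union_enumeration` and the sample rows are hypothetical.

```python
from itertools import count, islice

def union_enumeration(member):
    """Enumerate the union of the sets whose i-th row is member(i, 0), member(i, 1), ...

    The array is traversed diagonal by diagonal, and repetitions are struck out.
    """
    seen = set()
    for d in count():                 # d = i + j indexes the diagonal
        for i in range(d + 1):
            x = member(i, d - i)
            if x not in seen:
                seen.add(x)
                yield x

# Hypothetical example: row i enumerates the multiples of i + 1.
multiples = lambda i, j: (i + 1) * j
print(list(islice(union_enumeration(multiples), 6)))  # [0, 1, 2, 3, 4, 6]
```

Every entry of every row lies on some diagonal, so it is reached after finitely many steps; this is what makes the listing an enumeration of the union.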

We now consider another important countable set of numbers,
the real algebraic numbers.

A number x is a real algebraic number if, by definition, x is a real
root of an algebraic equation

(1)    $a_0 x^n + a_1 x^{n-1} + \cdots + a_{n-1} x + a_n = 0 \qquad (a_0 \neq 0)$

with integral coefficients, i.e., with the a's integers.

To show that the set of algebraic numbers is countable, we
shall begin by showing that the set of algebraic equations (1) is
countable. We can do this by the method of digits (which we used
in (iii)). The digits this time are 0, 1, ..., 9, x, +, -, =,
fourteen in all. So long as we deal only with equations of the
form (1) with numerical $a_0, \ldots, a_n$, and n, the exponents can be
written on the line without ambiguity. Then, assuming that the
a's and n are written in the decimal notation, the equation can be
reinterpreted as a positive integer in the quattuordecimal system
(i.e., the system of positional notation with fourteen digits).
We enumerate the equations in the order of the resulting
quattuordecimal numbers.

Now since equation (1) has at most n (distinct) roots, we
get a list with repetitions of all real algebraic numbers by
replacing in the enumeration of the algebraic equations each
equation by the finite list of its real roots. Finally we get the
desired enumeration of the real algebraic numbers by striking out
the repetitions in this last list.

By this time, one might begin to suspect that all infinite
sets are countable. Cantor's decisive contribution here was to
show by his famous "diagonal" method (1874) that there are
uncountable sets. We shall give three examples of uncountable sets.

Our first is the set of all (1-place) number-theoretic functions
(or number-theoretic functions of one variable), i.e., all
functions from $\mathbb{N}$ to $\mathbb{N}$. Examples of such functions are: $f(x) = x^2$,
$g(x) = 2x + 1$, $h(x) = 0$. This is certainly an infinite set, since
it contains all the constant functions.

To prove that it is uncountable, we begin by showing that
if an enumeration $f_0(x), f_1(x), f_2(x), \ldots$ of a set of
number-theoretic functions is given, then another number-theoretic
function f(x), different from all those in the enumeration, can be
constructed. This of course renders it absurd that the given
enumeration be one of all the number-theoretic functions (i.e.,
the set of the functions which are in the given enumeration is a
proper subset of the set of all the number-theoretic functions).
Each function in our enumeration can be represented by a table
giving its values for x = 0, 1, 2, ..., thus:

$$
\begin{array}{llllll}
f_0(x): & f_0(0) & f_0(1) & f_0(2) & f_0(3) & \cdots \\
f_1(x): & f_1(0) & f_1(1) & f_1(2) & f_1(3) & \cdots \\
f_2(x): & f_2(0) & f_2(1) & f_2(2) & f_2(3) & \cdots \\
f_3(x): & f_3(0) & f_3(1) & f_3(2) & f_3(3) & \cdots \\
\cdots
\end{array}
$$

We construct a new function f(x) by defining $f(x) = f_x(x) + 1$.
Thus f(x) is constructed by altering the "diagonal" elements
$f_0(0), f_1(1), f_2(2), \ldots$ in our table, so that f(x) differs
from $f_0(x)$ in the value for x = 0, from $f_1(x)$ in the value for
x = 1, from $f_2(x)$ in the value for x = 2, etc.; thus f(x) differs
from each of the functions in the given enumeration.

To express the argument another way, suppose f(x) were in
our list, say it is $f_q(x)$ (the function appearing in the (q+1)st
row); thus we would have $f(x) = f_q(x)$ for all natural numbers x.
So, in particular, $f(q) = f_q(q)$. But, by substituting q for x in
the definition of f(x), we obtain $f(q) = f_q(q) + 1$. Thus
$f_q(q) = f_q(q) + 1$, which is a contradiction.
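
The diagonal construction is itself a simple rule, and can be sketched in Python. The enumeration used below ($f_n(x) = n \cdot x$) is only a hypothetical stand-in for "a given enumeration"; the names `enumeration` and `diagonal` are chosen here for illustration.

```python
def enumeration(n):
    """A hypothetical given enumeration f_0, f_1, f_2, ...: here f_n(x) = n * x."""
    return lambda x: n * x

def diagonal(enum):
    """Return the function f defined by f(x) = f_x(x) + 1."""
    return lambda x: enum(x)(x) + 1

f = diagonal(enumeration)
print([f(x) for x in range(5)])   # [1, 2, 5, 10, 17]

# f differs from the q-th listed function in its value for x = q.
print(all(f(q) != enumeration(q)(q) for q in range(1000)))   # True
```

Whatever enumeration is supplied in place of the sample one, `diagonal` returns a function that disagrees with the q-th listed function at the argument q.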

We next show that the set of all real numbers is uncountable.
In discussing the real numbers, we assume that every real
number can be represented as the sum of an integer R ("characteristic")
and a positive infinite decimal ("mantissa"). In some
cases, this decimal can be either terminating or non-terminating;
e.g., -3/2 = -2 + .50000... = -2 + .49999... . If, in each such
case, we choose the form with repeating 9's instead of repeating
0's (the "non-terminating" decimal), the representation is unique.
Further examples: $\pi$ = 3 + .14159..., 0 = -1 + .99999...,
2/3 = 0 + .66666... . Assume that the set of real numbers has
been enumerated: $r_0, r_1, r_2, \ldots$ . We then have, for
$n = 0, 1, 2, \ldots$,

     $r_n = R_n + .r_{n0}\, r_{n1}\, r_{n2} \ldots$

We form the real number $r = 0 + .r'_{00}\, r'_{11}\, r'_{22} \ldots$,
where
     $r'_{nn} = r_{nn} + 1$  if  $r_{nn} \neq 9$,
He. 2 4 if e346 


Then, on the one hand, r does not terminate (i.e., does not have
all 0's after some digit), since none of $r'_{00}, r'_{11}, r'_{22}, \ldots$ is 0.
On the other hand, it is not in our list, since it differs from
$r_0$ in the tenths digit, from $r_1$ in the hundredths digit, etc.,
i.e., for each n (n = 0, 1, 2, ...) it differs from $r_n$, because
$r'_{nn} \neq r_{nn}$.

Since the set of the real algebraic numbers is countable,
and the set of (all) the real numbers is not, there must be real
numbers that are not algebraic, i.e., that are transcendental.
By applying Cantor's diagonal method to the case that $r_0, r_1, r_2, \ldots$
is a particular given enumeration of the real algebraic numbers,
we get as the r a particular transcendental number.
(Transcendental numbers were first shown to exist by a more special method
of Liouville in 1844.)

We now consider $2^{\mathbb{N}}$ = the set of all subsets of $\mathbb{N}$, and show
that this set is also uncountable. In the present context, it is
helpful to use the following representation of a subset of $\mathbb{N}$ as
a function from $\mathbb{N}$ to {0,1}: if f(x) is the function representing
the subset A, f(x) = 0 or 1 according as $x \in A$ or $x \notin A$.
Thus the function representing $\emptyset$ has the value 1 for all x;
the function representing a set including 0, 2, 3 but not 1, 4, 5
has the value 0 for x = 0, 2, 3 and the value 1 for x = 1, 4, 5
(the other values depending on which numbers > 5 belong).
Schematically:
     x               :  0  1  2  3  4  5  ...
     $\emptyset$     :  1  1  1  1  1  1  ...
     {0, 2, 3, ...}  :  0  1  0  0  1  1  ...
{2 re i 6 @ 2 EF Rh aw 


Clearly this representation is unique, and each function from $\mathbb{N}$
to {0,1} represents a subset of $\mathbb{N}$. To any given enumeration of a
set of such functions, we can construct a function not on the list
by altering each diagonal element, changing 0 to 1 and 1 to 0.
Thus the given enumeration is not an enumeration of the set of all
such functions. Therefore, the set of (all) the sets of natural
numbers is uncountable. In contrast, the set of the finite sets
of natural numbers is countable.
4. Cardinal number 
In section 3, we saw that there are two infinite sets be- 
tween which there exists no 1-1 correspondence, namely, the set of 
the natural numbers and the set of the real numbers. We shall 
also express this by saying there are at least two infinite car- 
dinal numbers. In this section we investigate more fully the 
existence and properties of cardinal numbers, both finite and 
infinite. 
As a preliminary, we list the following properties of the 
subset relation, which are immediate from the definition of "$\subset$":
     $A \subset A$
     ($A \subset B$ and $B \subset C$) $\rightarrow$ $A \subset C$.
The central idea in the theory of cardinal numbers is the
concept of equivalence of sets:

     $A \sim B$ (in words, A is equivalent to B) if, by definition, there is
a 1-1 correspondence between A and B.

(This use of the symbol "~" denotes a binary relation between
sets, and should not be confused with the unary operations
~A, the complement of a set A, or ~A, negation of a proposition A.)

The relation "~" has all the properties of a formal
equivalence relation:

     $A \sim A$
     $A \sim B \rightarrow B \sim A$
     ($A \sim B$ and $B \sim C$) $\rightarrow$ $A \sim C$
These properties can easily be deduced by using the definition of
1-1 correspondence as a set of ordered pairs. We carry out the
proof of the third property (transitivity) in detail: Assuming
that there is a 1-1 correspondence between A and B, and one
between B and C, we have to show a 1-1 correspondence between A and
C. For each member $a \in A$ there is exactly one ordered pair
(a,b) in the 1-1 correspondence between A and B. For the $b \in B$
determined by this pair, there is exactly one ordered pair (b,c)
in the 1-1 correspondence between B and C. The set of ordered
pairs (a,c) so obtained is a 1-1 correspondence. For, first, each
$a \in A$ occurs exactly once as the first member of a pair. Second,
each $c \in C$ occurs as second member of only one pair (a,c) (since
it occurs as second member of only one pair of the form (b,c),
$b \in B$) and each c occurs as second member of at least one pair
(since the b in this pair (b,c) occurs as second member in some
pair (a,b)).
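
The transitivity argument can be carried out on small finite sets. The following Python sketch treats a 1-1 correspondence as a set of ordered pairs, as in the proof; the function names and the sample sets are hypothetical illustrations.

```python
from collections import Counter

def compose(ab, bc):
    """From pairs (a, b) and pairs (b, c), form the pairs (a, c)."""
    bc_map = dict(bc)                      # each b is the first member of exactly one pair
    return {(a, bc_map[b]) for (a, b) in ab}

def is_correspondence(pairs, left, right):
    """Each member of left occurs exactly once as a first member,
    and each member of right exactly once as a second member."""
    firsts = Counter(a for a, _ in pairs)
    seconds = Counter(c for _, c in pairs)
    return firsts == Counter(left) and seconds == Counter(right)

A, B, C = {1, 2, 3}, {"x", "y", "z"}, {10, 20, 30}
ab = {(1, "x"), (2, "y"), (3, "z")}
bc = {("x", 20), ("y", 30), ("z", 10)}
ac = compose(ab, bc)
print(sorted(ac))                      # [(1, 20), (2, 30), (3, 10)]
print(is_correspondence(ac, A, C))     # True
```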

There are various definitions of "cardinal number" which
have been adopted by different authors. Frege (1884) and Bertrand
Russell (1902) suggested the following: the cardinal number $\bar{\bar{A}}$
of a set A is the set of all sets B where $B \sim A$. Another method
(von Neumann, 1928) of designating an object to represent the
cardinal number of a set A is to select a fixed set C from this set
of sets B, and call this set C the cardinal number $\bar{\bar{A}}$ of A. Under
any such method, one obtains what is essential; namely, that an
object is associated in common with all the sets equivalent to one
another. That is, equivalent sets have the same cardinal number;
or, in symbols,

     $\bar{\bar{A}} = \bar{\bar{B}} \leftrightarrow A \sim B$

(Here $\bar{\bar{A}}$ is Cantor's notation for the cardinal number of the set
A. The double bar indicates that two abstractions take place in
passing from the set to its cardinal number. The first is the
abstraction from the properties of the individual elements; the
second from the order in which he thought of the elements as being
given.)


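We now establish an ordering among cardinal numbers of sets
by the following definitions:

     $\bar{\bar{A}} < \bar{\bar{B}}$ if, by definition, for some $B_1$, $A \sim B_1 \subset B$, but for no
     $A_1$, $B \sim A_1 \subset A$.

Then we write $\bar{\bar{A}} > \bar{\bar{B}}$ for $\bar{\bar{B}} < \bar{\bar{A}}$, $\bar{\bar{A}} \le \bar{\bar{B}}$ for ($\bar{\bar{A}} < \bar{\bar{B}}$ or
$\bar{\bar{A}} = \bar{\bar{B}}$), $\bar{\bar{A}} \not< \bar{\bar{B}}$ for not $\bar{\bar{A}} < \bar{\bar{B}}$, etc.

This ordering has the following properties:
     (i) ($\bar{\bar{A}} < \bar{\bar{B}}$ and $\bar{\bar{B}} < \bar{\bar{C}}$) $\rightarrow$ $\bar{\bar{A}} < \bar{\bar{C}}$.
     (ii) The following three statements are mutually
exclusive: $\bar{\bar{A}} < \bar{\bar{B}}$, $\bar{\bar{A}} = \bar{\bar{B}}$, $\bar{\bar{A}} > \bar{\bar{B}}$.
     (iii) If $A \subset B$, then $\bar{\bar{A}} \not> \bar{\bar{B}}$.
The proofs of (ii) and (iii) are easy consequences of the
definitions of ~ and <. The proof of (i) is graphically
indicated in the following diagram: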


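[Diagram: four vertical lines, representing from left to right the sets A, B, C, A, joined by slanting solid and dotted lines for the correspondences described below.]
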
The vertical lines represent the sets A, B, C. The slanting solid
line from the left-hand A maps A onto $B_1 \subset B$ by "the
correspondence" $A \sim B_1$ (that is, some particular 1-1 correspondence between
A and $B_1$, which "$A \sim B_1$" asserts to exist). The equivalence $A \sim B_1$
is required by the first half of the definition of $\bar{\bar{A}} < \bar{\bar{B}}$. The
slanting solid line from B maps B onto $C_1 \subset C$ by the correspondence
$B \sim C_1$ of the first half of the definition of $\bar{\bar{B}} < \bar{\bar{C}}$. In the
correspondence $B \sim C_1$, the elements in the subset $B_1$ of B correspond
1-1 to elements in a subset $C_2$ of $C_1$, as indicated by the dotted
line. Thus (since ($A \sim B_1$ and $B_1 \sim C_2$) $\rightarrow$ $A \sim C_2$, and ($C_2 \subset C_1$ and
$C_1 \subset C$) $\rightarrow$ $C_2 \subset C$) we have $A \sim C_2 \subset C$, so that the first half of the
definition of $\bar{\bar{A}} < \bar{\bar{C}}$ is satisfied. We have used only the three
left-hand vertical lines A, B, C in this part of the proof.

We show that the second half of the definition of $\bar{\bar{A}} < \bar{\bar{C}}$
is satisfied, i.e., that for no $A_1$ is it true that $C \sim A_1 \subset A$, by
assuming the opposite and reaching a contradiction. We assume
that a 1-1 correspondence maps C onto $A_1 \subset A$, as indicated by the
slanting solid line from C to the right-hand vertical line
representing A. The composition of this mapping with the mapping of B
onto $C_1$ already sketched gives the correspondence $B \sim A_2 \subset A_1 \subset A$,
which contradicts the second half of the definition of $\bar{\bar{A}} < \bar{\bar{B}}$.
Our next objective is to discuss the cardinal numbers of 
finite sets. 
We shall begin by presupposing familiarity with the natural 
numbers 
     0, 1, 2, ...
as a system of objects. Although the understanding of this system 
of objects is fundamental, and it cannot be properly defined in 
terms of anything simpler, we may perhaps clarify what we mean by 
describing it thus. The natural numbers consist of exactly those 
objects which can be generated by starting with a first object 0 
(zero), and from any object n already generated passing to another 
object n+1 (the successor of n). Objects differently gener- 
ated are distinct. In particular, it follows from this descrip- 
tion that: 
     (1) 0 is a natural number.

     (2) If n is a natural number, then n + 1 is a
         natural number.

     (3) For any natural numbers n and m, if
         n + 1 = m + 1, then n = m.

     (4) For any natural number n, n + 1 $\neq$ 0.

     (5) Let P be a property meaningful for natural
         numbers. If
         (I) 0 has the property P, and
         (II) for any natural number n, if n has the property P
              then n + 1 has the property P,
         then every natural number n has the property P.
(Restated a little more concisely, writing P(n) for "n has the
property P" or generally for any proposition about a natural
number n: If
         (I) P(0), and
         (II) for every natural number n, if P(n) then P(n+1),
         then, for every natural number n, P(n).)


Properties (3) and (4) are implied by our understanding 
that differently generated objects in the natural number sequence 
are distinct. Property (5) is the Principle of Mathematical In- 
duction. From the present standpoint under which the natural num- 
bers are intuitively understood, it needs no more proof than the 
following. Whenever (I) and (II) are established for a particular 
P, we are in a position where, as we generate the natural numbers 
successively by starting with O and repeatedly passing from n to 
n +1, simultaneously we can as each number is generated recognize 
that that number has the property P. 

The five properties (1)-(5) are Peano's famous postulates 
(1889). They are often stated for the positive integers instead, 
with "1" as the first object instead of "0." It is immaterial 
which symbol is used, as long as the objects are being considered 
abstractly. However, for the applications we have in mind, in
particular, the application as finite cardinals, we need the "0."

In the following we presuppose the reader's familiarity 
with properties of the natural numbers, all of which can indeed be 


deduced from Peano's Postulates, if one is also allowed the right 


stractly, we mean that we are only considering them as a system
of objects without intrinsic properties individually, and known 
only through their position in the natural number sequence. Now 


we shall show how the familiar symbols for the natural numbers can
be adopted as symbols for the cardinal numbers of certain sets;
or, what comes to the same thing, we shall make an application of
the natural numbers in which they become cardinal numbers.


For this purpose, we adopt the following two definitions.

     0, the cardinal number 0 $=_{\text{def}}$ $\bar{\bar{\emptyset}}$.

     For any cardinal number $\bar{\bar{M}}$, $\bar{\bar{M}} + 1$ $=_{\text{def}}$ $\overline{\overline{M \cup \{a\}}}$,
     where $a \notin M$.

We must show that these concepts are well-defined in terms
of our basic principle: $\bar{\bar{A}} = \bar{\bar{B}} \leftrightarrow A \sim B$. 0 is well-defined,
since the only set equivalent to $\emptyset$ is $\emptyset$ itself. That $\bar{\bar{M}} + 1$ is
well-defined is shown by the following proposition:

If $M \cup \{a\} \sim N$ and $a \notin M$, then $N = N_1 \cup \{b\}$, where $M \sim N_1$
and $b \notin N_1$.

Proof. Let b be the element in N to which a corresponds
in the 1-1 correspondence $M \cup \{a\} \sim N$, and let $N_1 = N - \{b\}$. After
eliminating the pair (a,b) from the given 1-1 correspondence, we
have a 1-1 correspondence between M and $N_1$.

Now the two definitions just given, together with the
understanding that in case $\bar{\bar{M}} = n$ a natural number, $\bar{\bar{M}} + 1 = n + 1$
should receive its usual symbol (e.g. 5 + 1 = 6), lead to an
assignment of each natural number to serve as cardinal for certain
sets.

The natural numbers in the rôle of cardinal numbers we call
finite cardinal numbers (the other cardinals will be called
infinite or transfinite). A set which has a finite cardinal (i.e.
a natural number as a cardinal) is finite; otherwise, infinite.
This new definition of finite set will appear by Theorem 1
below to be equivalent to the one we gave in Sec. 3.


The natural numbers now play a dual role: on the one hand,
they were presupposed as a basic set of objects with the familiar
natural ordering, to distinguish which ordering we can write < as
$<_N$. On the other hand, they have been applied to serve as
symbols for cardinal numbers, and as such are subject to the ordering
of cardinal numbers generally, as it was defined above following
Cantor. To distinguish this latter ordering we can write < as $<_C$.
Now we shall show (in Theorem 3) that these two orderings agree,
i.e., that $n <_N m \leftrightarrow n <_C m$. (If it did not turn out thus, our
application of the natural numbers to serve as finite cardinals
would be extremely awkward.)


Theorem 1. For each natural number n, the finite cardinal
n is the cardinal of the set of natural numbers which precede n
in the natural order; i.e. if n is a natural number, the cardinal
$n = \overline{\overline{\{0, 1, \ldots, n-1\}}}$.

Proof. By mathematical induction on n.
By the above formulation, this requires us to establish two things.

(I) We must prove the proposition for n = 0. If n = 0,
there are no natural numbers preceding n in the natural order,
i.e., the set referred to in the theorem is $\emptyset$; and $\bar{\bar{\emptyset}} = 0$ by
definition.

(II) Now we assume (as "hypothesis of the induction") that
the cardinal $n = \overline{\overline{\{0, 1, \ldots, n-1\}}}$. From this assumption we must
infer the like with "n" replaced by "n + 1": i.e., we wish to
deduce that the cardinal $n + 1 = \overline{\overline{\{0, 1, \ldots, n\}}}$. Now n is
different from each of 0, 1, ..., n - 1; and $\{0, 1, \ldots, n\}$
$= \{0, 1, \ldots, n-1\} \cup \{n\}$. Thus, by the definition of $\bar{\bar{M}} + 1$,
$n + 1 = \overline{\overline{\{0, 1, \ldots, n-1, n\}}}$. This completes the proof by
mathematical induction, i.e., now that we have supplied (I) and
(II) for the particular P(n), the Principle of Mathematical
Induction allows us to conclude that for every natural number n,
P(n), which is the theorem.


Theorem 2. For each natural number n, if $\bar{\bar{A}} = n$, then A
is not equivalent to a proper subset of itself; i.e., for each
natural number n, if $\bar{\bar{A}} = n$, then not ($A \sim A_1 \subset A$ and $A_1 \neq A$).

Proof. By mathematical induction on n.

(I) Take n = 0. Then $\bar{\bar{A}} = 0$ and $A = \emptyset$. Then A has
no proper subsets, so ($A \sim A_1 \subset A$ and $A_1 \neq A$) is impossible.

(II) We assume, as hypothesis of the induction, that, for
any set M, if $\bar{\bar{M}} = n$, then not ($M \sim M_1 \subset M$ and $M_1 \neq M$). (We are
changing the "A" to "M" to avoid confusion with the "A" in
"P(n+1)".) We must infer that, if $\bar{\bar{A}} = n + 1$, then not ($A \sim A_1 \subset A$
and $A_1 \neq A$). Accordingly, assume $\bar{\bar{A}} = n + 1$. We shall derive a
contradiction by assuming now, contrary to what we wish to prove,
that $A \sim A_1 \subset A$ and $A_1 \neq A$. Since $\bar{\bar{A}} = n + 1$, $A = B \cup \{b\}$, where
$\bar{\bar{B}} = n$ and $b \notin B$. Let $b_1$ be the element of $A_1$ to which b
corresponds in the correspondence $A \sim A_1$. If $b \neq b_1$, switch b and
$b_1$ within A to get a new 1-1 correspondence and let B' be the
result of replacing $b_1$ by b within B in this case. In the other
case, when $b = b_1$, let B' be B itself. Let $B_1 = A_1 - \{b_1\}$.

Now we have:

     $B' \cup \{b_1\} = B \cup \{b\} = A \sim A_1 = B_1 \cup \{b_1\}$, with $b_1 \notin B_1$, $b_1 \notin B'$.

So   $\bar{\bar{B'}} = n$ (since $B' \sim B$ and $\bar{\bar{B}} = n$)
     $B' \sim B_1$
     $B_1 \subset B'$ (since $A_1 \subset A$)
and  $B_1 \neq B'$ (for if $B_1 = B'$, we would have
     $A_1 = B_1 \cup \{b_1\} = B' \cup \{b_1\} = A$, which
     contradicts the assumption that $A_1 \neq A$).

But $B_1 \subset B'$, $B' \sim B_1$, and $B_1 \neq B'$ contradict the induction
hypothesis, applied by taking B', $B_1$ as the "M" and "$M_1$."
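
Theorem 2 can be checked by exhaustion for very small sets. The sketch below is only a brute-force illustration, not a substitute for the induction; the function names are hypothetical. It tries every assignment of members of a proper subset $A_1$ to the members of A and looks for one that is a 1-1 correspondence.

```python
from itertools import combinations, product

def proper_subsets(A):
    """All subsets of A other than A itself."""
    elems = sorted(A)
    for k in range(len(elems)):            # sizes 0 .. n-1 only
        for sub in combinations(elems, k):
            yield set(sub)

def has_correspondence(A, A1):
    """True if some 1-1 correspondence exists between A and A1."""
    elems, targets = sorted(A), sorted(A1)
    for values in product(targets, repeat=len(elems)):
        # values[i] is the member of A1 assigned to elems[i]
        if len(set(values)) == len(elems) and set(values) == A1:
            return True
    return False

def equivalent_to_proper_subset(A):
    return any(has_correspondence(A, A1) for A1 in proper_subsets(A))

print([equivalent_to_proper_subset(set(range(n))) for n in range(5)])
# [False, False, False, False, False]
```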


Theorem 3. The natural ordering and the cardinal number
ordering agree for the natural numbers; i.e., for any natural
numbers n and m, $n <_N m \leftrightarrow n <_C m$.

Proof. Part I. Suppose $n <_N m$. We shall deduce that
$n <_C m$. Letting N denote $\{0, 1, \ldots, n-1\}$ and M denote
$\{0, 1, \ldots, m-1\}$, we have, by Theorem 1, $n = \bar{\bar{N}}$ and $m = \bar{\bar{M}}$.
Since $n <_N m$, each of 0, 1, ..., n-1 precedes m-1 in the
natural ordering, i.e., $M = \{0, 1, \ldots, n-1, \ldots, m-1\}$. Thus
$N \subset M$; and since $N \sim N$, we have $N \sim N \subset M$. So the first half of
the definition of $n <_C m$ is satisfied. To show that the second
half is also satisfied, we assume the contrary, i.e., $M \sim N_1 \subset N$.
But $N \subset M$, so also $N_1 \subset M$. But $N_1 \neq M$, since $m-1 \notin N$ and
thus $m-1 \notin N_1$. This contradicts Theorem 2 and establishes
that $n <_C m$.

Part II. Suppose that $n <_C m$; we infer $n <_N m$ as
follows. We know that for any natural numbers m and n, $n <_N m$,
or n = m, or $m <_N n$. So it suffices to show that the second
and third of these cases are incompatible with our assumption
$n <_C m$.

If n = m, we have N = M; so, as cardinals, $n = m = \bar{\bar{N}} = \bar{\bar{M}}$,
which, as we have remarked earlier, is incompatible with $n <_C m$.
If $m <_N n$, we apply Part I to infer that $m <_C n$, which is
incompatible with the hypothesis $n <_C m$.


Theorem 4. If S is a set of sets among which there is none
of greatest cardinal (i.e., such that, for each $A \in S$, there
exists $A' \in S$ with $\bar{\bar{A}} < \bar{\bar{A'}}$), then for each $A \in S$, $\overline{\overline{\bigcup S}} > \bar{\bar{A}}$.

Proof. We show that each half of the definition of $\overline{\overline{\bigcup S}} > \bar{\bar{A}}$
is satisfied.

Part I. By the definition of $\bigcup S$, $A \subset \bigcup S$, for any $A \in S$.
Thus $A \sim A \subset \bigcup S$.

Part II. To show that for no $A_1$, $\bigcup S \sim A_1 \subset A$, we assume the
opposite, namely, that for some $A_1$, $\bigcup S \sim A_1 \subset A$, and derive a
contradiction. Since $A' \subset \bigcup S$, the 1-1 correspondence $\bigcup S \sim A_1$ induces
a 1-1 correspondence between A' and a subset $A_2$ of $A_1$. (The
diagram and the discussion on p. 24 provide an earlier application of
this same argument.) We then have $A' \sim A_2 \subset A_1 \subset A$, contrary to
$\bar{\bar{A}} < \bar{\bar{A'}}$.

We now define
     $\aleph_0$ (alef null) $=_{\text{def}}$ $\bar{\bar{\mathbb{N}}} = \overline{\overline{\{0, 1, 2, \ldots\}}}$.
Corollary. For every natural number n, $n < \aleph_0$.
Proof. Let S be the set whose members are $\{0, 1, \ldots, n-1\}$,
for n = 0, 1, 2, ... . (Of course, when n = 0, we understand that
$\{0, 1, \ldots, n-1\} = \emptyset$.) If $A = \{0, 1, \ldots, n-1\}$ is any member of S, we
let $A' = \{0, 1, \ldots, n\}$. Then, by Theorem 1, $n = \bar{\bar{A}}$ and $n + 1 = \bar{\bar{A'}}$,
and by Theorem 3 and the familiar properties of the natural
numbers, $\bar{\bar{A}} = n < n + 1 = \bar{\bar{A'}}$. The hypothesis of Theorem 4 is
therefore satisfied by our set S. Moreover, $\bigcup S = \mathbb{N}$. Applying Theorem 4,
we conclude that $n = \bar{\bar{A}} < \overline{\overline{\bigcup S}} = \aleph_0$, for every natural number n.


Theorem 5. Any infinite set M has a countably infinite
subset A; thus we can express M thus:

     $M = N \cup \{a_0, a_1, a_2, \ldots\} = N \cup A$,

where $A = \{a_0, a_1, a_2, \ldots\}$, the a's are distinct elements of M, and
$N = M - A$.

Proof. Since M is infinite, it is not finite. Therefore,
M does not have a natural number as cardinal. In particular,
$\bar{\bar{M}} \neq 0$, so that M is not empty. Let $a_0$ be a member of M, and
consider the set $M - \{a_0\}$. If $M - \{a_0\}$ were empty, $\bar{\bar{M}}$ would be
the natural number 1, contrary to what we have observed above.
Let $a_1$ be an element of $M - \{a_0\}$ (which is clearly different from
$a_0$), and consider $(M - \{a_0\}) - \{a_1\} = M - \{a_0, a_1\}$. If this set
were empty, $\bar{\bar{M}}$ would be the natural number 2; so $M - \{a_0, a_1\}$ has a
member $a_2$, which is clearly distinct from each of $a_0$ and $a_1$. We
continue in this way to find, for each natural number n, an
element $a_n$, distinct from each of $a_0, a_1, \ldots, a_{n-1}$.

Corollary 1. For every infinite set M, $\aleph_0 \le \bar{\bar{M}}$.

Proof. We recall that $\aleph_0 \le \bar{\bar{M}}$ means that $\aleph_0 < \bar{\bar{M}}$ or
$\aleph_0 = \bar{\bar{M}}$. By the theorem, we can write $M = N \cup \{a_0, a_1, a_2, \ldots\} = N \cup A$.
Here $\bar{\bar{A}} = \aleph_0$. Now $A \sim A \subset M$.

Case I. Suppose that for no set $A_1$, $M \sim A_1 \subset A$. Then both
halves of the definition of $\bar{\bar{A}} < \bar{\bar{M}}$ are satisfied, and thus
$\aleph_0 = \bar{\bar{A}} < \bar{\bar{M}}$.

Case II. Suppose that for some $A_1$, $M \sim A_1 \subset A$. Then M may
be enumerated as follows: by the 1-1 correspondence $M \sim A_1$, each
element of M is made to correspond to an element of $A_1$, which,
since $A_1 \subset A$, is of the form $a_n$, for some natural number n. By
listing the members of M in the order of the subscripts of the
corresponding a's, we obtain an enumeration of M. Thus $\bar{\bar{M}} = \bar{\bar{\mathbb{N}}}$,
i.e., $\bar{\bar{M}} = \aleph_0$.

The corollary to Theorem 4 states that $\aleph_0$ is an infinite
cardinal (since it is different from each finite cardinal n), and
Corollary 1 to Theorem 5 states that any infinite cardinal
different from $\aleph_0$ is greater than $\aleph_0$. Thus $\aleph_0$ is the least
infinite cardinal.


Corollary 2. Any infinite set M is equivalent to a proper
subset of itself.

Proof. Using the theorem, we can write
     $M = N \cup \{a_0, a_1, a_2, \ldots\} = N \cup A$,
where N and A are disjoint. Now the set $M_1 = N \cup \{a_1, a_2, a_3, \ldots\} =
N \cup A'$ is a proper subset of M, since $a_0 \notin M_1$. A 1-1
correspondence between M and $M_1$ is obtained by making each member of N
correspond to itself, and making $a_n \in A$ correspond to $a_{n+1} \in A'$.
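
The correspondence in this proof can be written down explicitly in a concrete case. In the following Python sketch the choices are hypothetical: M is taken to be the natural numbers, A the even numbers with $a_n = 2n$, and N the odd numbers; any other decomposition supplied by Theorem 5 would serve equally.

```python
def a(n):
    """Hypothetical choice of the distinct elements a_0, a_1, a_2, ...: a_n = 2n."""
    return 2 * n

def in_A(x):
    return x % 2 == 0

def shift(x):
    """Members of N correspond to themselves; a_n corresponds to a_(n+1)."""
    if in_A(x):
        n = x // 2            # the subscript n with a_n = x
        return a(n + 1)
    return x

images = [shift(x) for x in range(100)]
print(len(set(images)) == len(images))   # True: distinct members have distinct images
print(0 in images)                        # False: a_0 = 0 is left out of M_1
```

Within the range tested, every member of M other than $a_0$ appears among the images, so the map pairs M with the proper subset $M_1 = M - \{a_0\}$.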

Corollary 2 to Theorem 5, just proved, shows that Galileo's 
Paradox (see pp. 13-14) is not a feature only of the particular 
set of the natural numbers. In fact, every infinite set is equiv- 
alent to a proper subset of itself (by this corollary), and only 
infinite sets have this property (by Theorem 2). Thus, this prop- 
erty is characteristic of infinite sets, and has even been taken 
as the defining property of infinite sets in some treatments of 
the theory. (This definition was first introduced by Dedekind in 
1888.) 


Corollary 3. A countably infinite set of elements
$b_0, b_1, b_2, \ldots$ can be adjoined to an infinite set M, or removed from
a set if the resulting set M is infinite, without changing the
cardinal number of the original set.

Proof. Using the theorem to express M as:
     $M = N \cup \{a_0, a_1, a_2, a_3, a_4, a_5, \ldots\}$,
the insertion of $b_0, b_1, b_2, \ldots$ can be effected thus:
     $M \cup \{b_0, b_1, b_2, \ldots\} = N \cup \{a_0, b_0, a_1, b_1, a_2, b_2, \ldots\}$.
The 1-1 correspondence is obvious.


We have defined, for any set A, $2^A$ to be the set of all the
subsets of A. We now define

     $2^{\bar{\bar{A}}} =_{\text{def}} \overline{\overline{2^A}}$.

Theorem 6 (Cantor's Theorem). For each set A, $\bar{\bar{A}} < 2^{\bar{\bar{A}}}$.

Proof. Part I. Let B be the subset of $2^A$ whose members
are the unit subsets of A; i.e., the members of B are the sets {a}
where a is any member of A. The correspondence between A and B
which matches each $a \in A$ with the unit set $\{a\} \in B$ is clearly
1-1, so that $A \sim B \subset 2^A$.

Part II. Assume, contrary to what we wish to prove, that
$2^A \sim A_1 \subset A$. Then we shall derive a contradiction to the following
lemma, which will complete the proof of the theorem.

Lemma. If C is a subset of $2^A$ for which $C \sim A_1 \subset A$, then
$C \neq 2^A$.

Proof. We use Cantor's diagonal method, modified
inessentially from the applications of this method used earlier in this
section. Before proving the lemma for a general set A, we
illustrate the method by considering a specific set $A = \{0, 1, 2, 3, 4\}$,
a specific $A_1 = \{0, 1, 2\}$, and a specific C whose members are
$c_0 = \{1, 2, 3, 4\}$, $c_1 = \{0, 2\}$, and $c_2 = \{0, 1, 2\}$. The 1-1
correspondence between C and $A_1$ is indicated by the subscripts on $c_0, c_1, c_2$.
ence between C and A, is indicated by the subscripts on co,¢1,Cce- 
We can represent any subset of A by a row of 0's and 1's,
corresponding to the elements of A, a 0 corresponding to each
element which belongs to the subset, and a 1 to each element which
does not belong. Thus the three subsets $c_0, c_1, c_2$ in our
illustration are represented by the three rows in the following tabulation:

     members of A           0  1  2  3  4
     $c_0 = \{1, 2, 3, 4\}$   1  0  0  0  0
     $c_1 = \{0, 2\}$         0  1  0  1  1
     $c_2 = \{0, 1, 2\}$      0  0  0  1  1

In the 3x3 square of 0's and 1's to the left of our tabulation,
we alter each diagonal element (here 1, 1, 0) by replacing each
0 by 1, each 1 by 0. This leads us to a row of 0's and 1's:
     0, 0, 1, -, -
(where the reader may supply a 0 or 1 for each blank in any way
he chooses). This row represents a subset c of A which is
different from $c_0$ because $0 \notin c_0$ while $0 \in c$, from $c_1$ because
$1 \notin c_1$ while $1 \in c$, from $c_2$ because $2 \in c_2$ while $2 \notin c$ (which we
have secured by changing 0's and 1's along the diagonal of the
square).

To recapitulate, c is constructed by excluding each element
a of $A_1$ which is a member of the set to which a corresponds in
the correspondence $C \sim A_1$, and including each member b of $A_1$
which is not a member of the set to which b corresponds in the
correspondence $C \sim A_1$; while either excluding or including in any
way at all the members of $A - A_1$. The subset c of A differs then
from each member d of C in respect to the membership or
non-membership of the element of $A_1$ to which d corresponds in the
correspondence $C \sim A_1$.
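
The construction just recapitulated can be run on the specific sets of the illustration. In this Python sketch the correspondence $C \sim A_1$ is represented as a dictionary from each element a of $A_1$ to the set $c_a$; the function name `diagonal_subset` and the parameter `extra` (for the freely chosen members of $A - A_1$) are hypothetical.

```python
A = {0, 1, 2, 3, 4}
A1 = {0, 1, 2}
# The correspondence C ~ A1: each a in A1 is matched with the set c_a.
C = {0: {1, 2, 3, 4}, 1: {0, 2}, 2: {0, 1, 2}}

def diagonal_subset(correspondence, extra=frozenset()):
    """Include a exactly when a is not a member of the set matched with a;
    members of A - A1 may be added freely through `extra`."""
    return {a for a, c_a in correspondence.items() if a not in c_a} | set(extra)

c = diagonal_subset(C)
print(sorted(c))                                  # [0, 1]
print(all(c != c_a for c_a in C.values()))        # True
print(all(diagonal_subset(C, {3}) != c_a for c_a in C.values()))   # True
```

The result {0, 1} corresponds to the row 0, 0, 1, -, - with both blanks filled by 1 (non-membership); filling a blank differently, as with `extra={3}`, still gives a subset of A that is not a member of C.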

Described in these terms, the construction applies to the
general case of the lemma. To "visualize" the construction in the
general case, we have to imagine a "table" of 0's and 1's with
rows corresponding to the elements of C (where $\bar{\bar{C}}$ is any cardinal
number) and columns corresponding to the elements of A, those
corresponding to elements of $A_1$ coming first to form, with the
rows, a "square."

We now have discovered a hierarchy of distinct cardinals
in order of increasing magnitude, as follows:

     $0, 1, 2, \ldots, \aleph_0, 2^{\aleph_0}, 2^{2^{\aleph_0}}, \ldots, p, 2^p, 2^{2^p}, \ldots$

First we have the natural numbers 0, 1, 2, ..., which appear in
the familiar order by Theorem 3. Then next we have $\aleph_0$ by the
corollary to Theorem 4. Thence we get successively greater
cardinals $2^{\aleph_0}$, $2^{2^{\aleph_0}}$, ..., by applications of Theorem 6. After
all these cardinals, we get a still greater one, say p, by using
Theorem 4; and so on. Thus Cantor showed that there is not just
one kind of actual infinite, but a never-ending hierarchy of
increasing infinite cardinals.

We know that $\aleph_0$ is the least infinite cardinal and that
$\aleph_0 < 2^{\aleph_0}$. The following question arises: is $2^{\aleph_0}$ the next
greater cardinal after $\aleph_0$? This is the celebrated "continuum
problem" of Cantor, so called because $2^{\aleph_0}$ may be shown to be the
cardinal number of the set of real numbers, often referred to as
the linear continuum (see the end of the section). The "continuum
problem" has never been solved, but it has been shown by Gödel
(1938) that an affirmative answer to the question (i.e., the
assumption that no cardinal lies between $\aleph_0$ and $2^{\aleph_0}$) is
consistent with the other assumptions usually taken as axioms for
set theory (see Sec. 6).³

For any two sets A and B, there are exactly four cases
regarding mutual equivalences:

Case I. For some $B_1$, $A \sim B_1 \subset B$; and for some $A_1$, $B \sim A_1 \subset A$.
Case II. For some $B_1$, $A \sim B_1 \subset B$; but for no $A_1$, $B \sim A_1 \subset A$.
Case III. For no $B_1$, $A \sim B_1 \subset B$; but for some $A_1$, $B \sim A_1 \subset A$.
Case IV. For no $B_1$, $A \sim B_1 \subset B$; and for no $A_1$, $B \sim A_1 \subset A$.

³ A stimulating discussion is given in Gödel, K., What is Cantor's
continuum problem?, American Mathematical Monthly, v. 54,
pp. 515-525.

In Cases II and III, our definitions give immediately
$\bar{\bar{A}} < \bar{\bar{B}}$ and $\bar{\bar{A}} > \bar{\bar{B}}$, respectively. By definition, $\bar{\bar{A}} = \bar{\bar{B}}$ if $A \sim B$,
which comes under Case I by taking $B_1 = B$ and $A_1 = A$; but Case I
as stated is a more general condition than $A \sim B$, in that A may be
given as equivalent to a proper subset of B and vice versa.
Theorem 7 below states that in Case I, in fact $A \sim B$, so $\bar{\bar{A}} = \bar{\bar{B}}$.


Theorem 7. (Equivalence Theorem, F. Bernstein, 1897).
If for some $B_1$, $A \sim B_1 \subset B$, and for some $A_1$, $B \sim A_1 \subset A$, then $A \sim B$.

We omit the proof of this theorem, which could be given
here in two pages. For two proofs, see Fraenkel, loc. cit.1,
pp. 98-103.

Corollary. If $A \subset B$, then $\bar{\bar{A}} \le \bar{\bar{B}}$.

Proof. Since $A \sim A \subset B$, only Cases I and II are possible,
and the theorem tells us that Case I gives $\bar{\bar{A}} = \bar{\bar{B}}$.

Case IV turns out to be an impossibility, as shown by

Theorem 8. (Zermelo's Comparability Theorem, 1904).
For any two sets A and B, either there is a $B_1$ such that $A \sim B_1 \subset B$,
or there is an $A_1$ such that $B \sim A_1 \subset A$.

The proof of this theorem is deep, depending upon Cantor's
theory of "well-ordered" sets, and Zermelo's proof by means of his
"Axiom of Choice" that every set can be "well-ordered." The proof
is given in Fraenkel, loc. cit.1, p. 315.

Corollary. For any two cardinals $\bar{\bar{A}}$ and $\bar{\bar{B}}$, either $\bar{\bar{A}} < \bar{\bar{B}}$
or $\bar{\bar{A}} = \bar{\bar{B}}$ or $\bar{\bar{A}} > \bar{\bar{B}}$.

Proof. By the theorem, Case IV is impossible; and, as we
have already noted, Cases II, I, and III give $\bar{\bar{A}} < \bar{\bar{B}}$, $\bar{\bar{A}} = \bar{\bar{B}}$, and
$\bar{\bar{A}} > \bar{\bar{B}}$ respectively.

By combining this corollary with the properties of cardi- 
nals listed as (i) and (ii) on p. 23, we find that any set of car- 


dinal numbers is linearly ordered. 


By using the above theory (including Corollary 3 to
Theorem 5 and the corollary to Theorem 7), it is not hard to show that
the set of real numbers, the set of 1-place number-theoretic
functions and the set $2^{\mathbb{N}}$ of the sets of natural numbers all have
the same cardinal, which we have called $2^{\aleph_0}$ (in connection with
the last example $2^{\mathbb{N}}$). The set of real numbers geometrically is the
real line. It can be shown that this set is equivalent to the
real plane; so the plane (as a set of points) also has the
cardinal $2^{\aleph_0}$. The set of all plane point sets then has the cardinal
$2^{2^{\aleph_0}}$. This illustrates that most infinite sets of objects
commonly used in mathematics do not have cardinals very far up in the
hierarchy $\aleph_0, 2^{\aleph_0}, 2^{2^{\aleph_0}}, \ldots$ .


5. The paradoxes 

Cantor's development of the theory of abstract sets is 
called "naive" set theory to distinguish it from axiomatized set 
theory which will be discussed in Sec. 6. In using Cantor's 
"definition" of a set (p. 1), we are guided by intuition in decid- 
ing which objects are sets and which are not. The relation be- 
tween mathematics and set theory was like the course of true love, 
it did not run smooth. From the beginning, there was resistance 
to Cantor's use of the actual infinite. Just when his ideas were 
well on the way toward prevailing, in the 1890's, contradictions 
appeared in the upper reaches of his set theory. Nevertheless, 
since then set theory (suitably adapted) has increased its place 
in mathematics, while the paradoxes have focussed attention on the 
foundations of set theory and of mathematics generally. 

The Burali-Forti Paradox which appeared in 1897 (but was 
known to Cantor in 1895) arises in Cantor's theory of ordinal num- 


bers, which we have not discussed. 




Sets 39 

Russell's Paradox (1902-3) concerns the set of all sets 
which do not contain themselves, and is comparatively familiar 
from the following popularized version: The barber in a certain 
village shaves all and only those persons in the village who do 
not shave themselves. Question: does he shave himself? 

We now give in detail a third paradox of the theory of
sets, the Cantor paradox (1899). Let M be the set of all sets.
Then $2^M$ is a set of sets, and, as such, $2^M \subset M$. So $\bar{\bar{2^M}} \le \bar{\bar{M}}$. But,
by Cantor's theorem (Theorem 6), $\bar{\bar{2^M}} > \bar{\bar{M}}$. Thus we have a contradiction.

A somewhat different type of paradox, concerning finite 
definability and employing Cantor's diagonal method, is the 
Richard Paradox (1905), which runs as follows. 

By a "phrase" we shall mean any finite sequence each of
whose members is either a blank or one of the twenty-six letters
of our alphabet. Thus, "abacadabra", "of cabbages and kings",
"the square of n", and "xtu rltp" are phrases. We can enumerate
the phrases by the "method of digits," using a 27-digit number
system to represent the natural numbers. Certain phrases, such as
our third example "the square of n", describe in the English language
1-place number-theoretic functions. We now strike out from
our enumeration of all the phrases those which do not describe
such functions; thereby we obtain an enumeration $P_0, P_1, P_2, \ldots$
of the phrases which do. Say the functions described are $f_0(n)$,
$f_1(n)$, $f_2(n)$, ..., respectively.
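
The "method of digits" can be sketched in Python. This is only one possible reading of it, not a construction given in these notes: we assume that the blank and the letters a to z serve as the 27 digits, and we use bijective base-27 numeration so that phrases beginning with blanks still receive distinct numbers. The names ALPHABET, phrase and index are ours.

```python
# Assumption of this sketch: the 27 "digits" are the blank followed by a..z.
ALPHABET = " abcdefghijklmnopqrstuvwxyz"


def phrase(k):
    """Return the phrase numbered k (k = 0, 1, 2, ...).

    Bijective base-27 numeration is used, so every non-empty finite
    sequence of the 27 symbols occurs exactly once in the enumeration.
    """
    k += 1  # bijective numeration naturally starts counting at 1
    symbols = []
    while k > 0:
        k, r = divmod(k - 1, 27)
        symbols.append(ALPHABET[r])
    return "".join(reversed(symbols))


def index(p):
    """Return the number of the phrase p (the inverse of phrase)."""
    k = 0
    for ch in p:
        k = k * 27 + ALPHABET.index(ch) + 1
    return k - 1


# Every phrase has a definite place in the enumeration.
print(index("the square of n"))
print(phrase(index("the square of n")))
```

Striking out the phrases which do not describe number-theoretic functions is of course not something this sketch attempts; that step depends on the meaning of the phrases in English.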

We now consider the following phrase: "the function whose
value for each natural number n is obtained by adding one to the
value for n of the function described by the phrase corresponding
to n in our enumeration of the phrases which describe one place




Sets 40 


number theoretic functions". In this phrase, we could replace
the last part "in our enumeration of the phrases which describe
one place number theoretic functions" by a detailed description
of the exact construction of the enumeration, and so obtain another
phrase fully describing the same function.

This phrase describes a number-theoretic function, namely,

$f(n) = f_n(n) + 1$.

Hence it occurs in the enumeration $P_0, P_1, P_2, \ldots$. But the
function described differs from that described by $P_0$ in the value
for n = 0, from that described by $P_1$ in the value for n = 1,
from that described by $P_2$ in the value for n = 2, and so on.
Otherwise expressed: Since this phrase occurs in the enumeration
$P_0, P_1, P_2, \ldots$, it is $P_q$ for some q. Then

$f(n) = f_q(n)$.

Contradiction arises by substituting q for n in this and the last
equation. (In Richard's original version, real numbers were used
instead of 1-place number-theoretic functions.)
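
The diagonal step can be illustrated on a finite list of functions. The following Python sketch is ours, not part of the notes; the four sample functions are arbitrary stand-ins for $f_0, f_1, f_2, f_3$. It shows that the function f(n) = f_n(n) + 1 differs from each listed function at its own index.

```python
def diagonal(functions):
    """Return f with f(n) = functions[n](n) + 1, for n within the list."""
    def f(n):
        return functions[n](n) + 1
    return f


# Hypothetical sample enumeration f_0, f_1, f_2, f_3 (chosen only for illustration).
sample = [
    lambda n: n * n,   # "the square of n"
    lambda n: n + 2,
    lambda n: 0,
    lambda n: 2 * n,
]

f = diagonal(sample)

# f differs from f_q at the argument q, for every q in the list.
for q in range(len(sample)):
    print(q, sample[q](q), f(q))
```

With a finite list there is no paradox: f is simply a new function not in the list. The contradiction in the text arises only because the defining phrase is itself claimed to occur in the enumeration.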

This paradox is closely connected with the facts that, on
the one hand, only a countable infinity of number-theoretic functions
are describable in a given language (because the set of all
the phrases in the language is only countably infinite), while, on
the other hand, the set of all the number-theoretic functions
is uncountable (by Cantor's diagonal method, Sec. 3).

A related paradox was given by Berry in 1906. Consider the
expression "the least natural number not nameable in fewer than
twenty-two syllables." This expression names a definite natural
number, say n, since each non-empty set of natural numbers (in this
case, the set of natural numbers not nameable in fewer than twenty-two
syllables) has a least element. Thus n is not nameable in




Sets 41


fewer than twenty-two syllables. But our expression naming n has 
in fact exactly twenty-one syllables! 
The modern paradoxes are related to the paradox of "The
Liar," which comes from antiquity.* The following statement is
attributed to the Cretan philosopher Epimenides, sixth century
B.C.: "All Cretans are liars." Let us suppose that by "liar"
Epimenides meant a person who never tells the truth.

Suppose his statement is true; then by what it says and by
his being a Cretan, it is false, which is a contradiction. So by
reductio ad absurdum, the statement is not true, i.e., it is false.
This means that at some time some Cretan has told or eventually will
tell the truth. This, however, should be a matter for the historian
to decide, i.e., it should not be demonstrable on logical
grounds only, as we appear to have demonstrated it.

The direct form of the paradox of "The Liar" was given by
Eubulides in the fourth century B.C. We can give it as follows:
"The statement I am now making is a lie." One sees directly that
the quoted statement cannot be true and that it cannot be false.

In the ancient "dilemma of the crocodile," a crocodile has
stolen a child, but offers to return the child to his father, if
the father can guess whether or not the crocodile will return the
child. If the father guesses that the crocodile will not return
the child, the crocodile is in a dilemma.


A missionary, fallen among cannibals, discovers that he is
about to become their supper. They offer him the opportunity to
make a statement, under the conditions that if the statement is


*Historical details and discussion are to be found in Weyl, H., 
Philosophy of Mathematics and Natural Science, Princeton University 
Press, Princeton, 1949 (see p. 228). 


Sets 


42 
true, he will be boiled, and if it is false, he will be roasted. 
What should the missionary say? 


6. Axiomatic set theory


Cantor's and the other paradoxes of set theory show the
difficulties inherent in the attempts to develop the theory on
a naive or intuitive basis starting from Cantor's definition of
set. These difficulties pose the problem how to modify set theory
so that contradictions do not arise. In fact, the problem
goes further, and forces us to ask ourselves wherein we were deceived
by the methods of constructing and reasoning about objects
which had seemed convincing before they were found to eventuate
in paradoxes. Complete agreement among mathematicians on the
cause of the paradoxes and the cure has not been attained yet
(1956), and it seems problematical that it ever will.

In this section we shall consider mainly the least radical
reformulation of mathematics to avoid the paradoxes. This consists
in observing that the paradoxes of set theory are associated
with using "too large" sets, such as the set M of all sets
which we used in Cantor's paradox. Since unrestricted use of intuition
starting with Cantor's definition of set led to the difficulty,
Zermelo proposed in 1908 to restrict the sets to those
provided by a list of axioms. These axioms are drawn up so that
there is no apparent means to derive the known paradoxes from
them. On the other hand, the axioms do suffice for the deduction
of the existing classical mathematics, including abstract set
theory, short of the paradoxes.

We give now, in our words, the list of axioms or principles
which appear in Fraenkel, loc. cit.,1 with the pages where
they are to be found. Choosing this list of axioms has the ad-


Sets 


43 
vantage that Fraenkel's exposition is built around them, and
Fraenkel plans a new volume,4 which presumably is to treat the
subject in connection with these axioms.

(I) (Axiom of Extensionality) p. 21. Two sets A and
B are equal if and only if they contain the same elements; i.e.,
$A = B \leftrightarrow (A \subset B \text{ and } B \subset A)$.

(II) (Axiom of Subsets) p. 22. Given a set A and a proposition
P(x) meaningful for the elements of A (i.e., for each
$x \in A$, either P(x) is true or P(x) is false), there exists the
set $\{x \in A \mid P(x)\}$ containing exactly those elements of A for
which P(x) is true. (This Axiom is also called the Axiom of Selection,
the Axiom of Segregation, or the Aussonderungsaxiom.)

(III) (Axiom of Pairing) p. 24. If a and b are different
objects, there exists the set $\{a, b\}$ containing exactly a
and b.

(IV) (Axiom of Union) p. 28. Given a set S of sets,
there exists the set $\bigcup S$ containing just the elements of the elements
of S.

(V) (Axiom of Infinity) p. 42. There exists at least
one infinite set: the set $\{0, 1, 2, \ldots\}$ of the natural numbers.
(Fraenkel used $\{\emptyset, \{\emptyset\}, \{\{\emptyset\}\}, \ldots\}$.)


(VI) (Axiom of the Power Set) p. 97. Given a set A,
there exists the set $2^A$ whose elements are all the subsets of A.

(VII) (Axiom of Choice) p. 123. Given a disjointed set
S of sets, there exists a set C which has as elements one and
only one element from each member of S.


4 Fraenkel, A. A., Foundations of Set Theory, forthcoming.




Sets 44 
We illustrate how the sets we introduced in Secs. 1 and 4 


are provided under these axioms. 


(i)   $\{a\} = \{x \in \{a,b\} \mid x = a\}$                    (III), (II)
(ii)  $\emptyset = \{x \in \{a,b\} \mid x \neq x\}$             (III), (II)
(iii) $A \cup B = \bigcup \{A, B\}$                             (III), (IV)
(iv)  If a, b, c are distinct objects,
      $\{a,b,c\} = \{a,b\} \cup \{c\}$                          (III), (i), (iii)
(v)   $A \cap B = \{x \in A \mid x \in B\}$                     (II)
(vi)  $\bigcap S = \{x \in A_1 \mid \text{for all } Y,\ Y \in S \to x \in Y\}$,
      where $A_1$ is some fixed member of S                     (II)

(vii) As our last illustration, we obtain the Cartesian
product $A \times B$, i.e., the set of all ordered pairs (a,b), where
$a \in A$ and $b \in B$. Recall that $(a,b) = \{\{a,b\}, \{a\}\}$. Thus each
member (a,b) of $A \times B$ is a set, whose members (namely, $\{a,b\}$ and
$\{a\}$) are subsets of $A \cup B$ and therefore belong to $2^{A \cup B}$. So the
members (a,b) of $A \times B$ are members of $2^{2^{A \cup B}}$, i.e., $A \times B \subset 2^{2^{A \cup B}}$,
and we can apply the axiom of subsets thus:

$A \times B = \{x \in 2^{2^{A \cup B}} \mid$ (for some a and b, $a \in A$ and $b \in B$ and
$a \neq b$ and, for all y, $y \in x$ if and only if $y = \{a,b\}$ or $y = \{a\}$)
or (for some a, $a \in A$ and $a \in B$, and for all y, $y \in x$ if and
only if $y = \{a\}$)$\}$.
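
For small finite sets the construction in (vii) can be carried out literally. The Python sketch below is an illustration of ours, assuming finite sets represented as frozensets; the names power_set, pair, is_pair_from and cartesian_product are not from the notes. It forms the power set of the power set of A ∪ B and then separates out, as the axiom of subsets allows, the members satisfying the stated condition.

```python
from itertools import chain, combinations


def power_set(s):
    """Return the set of all subsets of s, each subset as a frozenset."""
    items = list(s)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return frozenset(frozenset(c) for c in subsets)


def pair(a, b):
    """The ordered pair (a, b) coded as the set {{a, b}, {a}}."""
    return frozenset({frozenset({a, b}), frozenset({a})})


def is_pair_from(x, A, B):
    """The selecting condition of (vii), checked by exhaustive search."""
    for a in A:
        for b in B:
            if a != b and x == frozenset({frozenset({a, b}), frozenset({a})}):
                return True
    # Case a = b: the pair (a, a) collapses to {{a}}.
    return any(x == frozenset({frozenset({a})}) for a in A & B)


def cartesian_product(A, B):
    """Select the ordered pairs from the double power set of A u B."""
    universe = power_set(power_set(A | B))
    return frozenset(x for x in universe if is_pair_from(x, A, B))


A = frozenset({1, 2})
B = frozenset({2, 3})
product_AB = cartesian_product(A, B)
print(len(product_AB))
```

The double power set grows very quickly (here it already has 256 members), so this is a check of the definition rather than a practical way to compute products.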

A form of the Axiom of Choice was first explicitly noted as
an assumption in Zermelo's 1904 proof of his "well-ordering theorem,"
which entails Theorem 8. The form given here, which is
Russell's "Multiplicative Axiom," does suffice for the derivation
of Zermelo's form (and vice versa). A simple application occurs in
the proof of Theorem 5 above.


The Axiom of Choice has been the subject of much research 


Sets 


45 
with a view to minimizing its use, singling out its consequences, 
or (Gédel 1938) defending it as an assumption which can be con- 
sistently added to the other axioms of set theory if those axi- 
oms by themselves are consistent. 

At one point these axioms (as did the axioms given by
Zermelo in 1908) lack definiteness. This is in (II), where the
notion of a proposition P(x) meaningful for elements $x \in A$ is
incorporated. This lack of definiteness was first remedied by
Fraenkel (1922) and somewhat differently by Skolem (1922). What 
is required is a specification of a class of admissible proposi- 
tions P(x). Rules for constructing these P(x)'s can be formulated 
in connection with the specification of the symbolism of a lan- 
guage in which the deductions from the axioms are to be carried 
out. 

The specification of the symbolism of a language, which is 
necessary for the purpose of being exact in logical deductions, 
comes under the subject "Logic" of our second chapter. The 
Richard paradox and the paradox of "The Liar" show that care is 
necessary in this. That is, the language of a mathematical theory
must be subjected to rules governing the formation of propositions
somewhat akin to the rules listed above governing the existence
of sets.


Chapter II
LOGIC
7. The propositional calculus (model theory); validity


Logic has a crucial role in the axiomatic treatment of 
mathematics, for logic must supply the deductions. That is, it 
is up to logic to determine the circumstances under which a prop- 
osition is a theorem of a mathematical theory, after the axioms 
for the theory have been selected. 

Now there can in fact be different systems of logic (e.g., 
"classical" and "intuitionistic"), just as there can be different 
systems of axioms for geometry (e.g., "Euclidean" and "non- 
Euclidean"). With the same axioms, different theorems will result 
from the adoption of different systems of logic. 

Fortunately for us, since we have only limited time to devote
to the topic of logic, we can concentrate on a particular
system which encompasses a good part of the deductions used in
modern mathematics. This is the (classical) predicate calculus
(of first order). We begin by studying a part of this system,
called the (classical) propositional calculus.


Logic says what inferences from propositions to other prop- 


ositions are valid on the basis of the form of the propositions 
concerned. (If one wants to be meticulous in the terminology 
here, one may say “the form of the sentences expressing the propo- 
sitions.") Before we can set up a system of logic, we must have a 
set of syntactical rules for the language (or part of some lan- 
guage) in which are constructed the propositions (or sentences) 
whose logic we are to treat. Complete examples will be given 
later of the construction of a mathematical language. 


Here, however, in order to proceed directly to the proposi- 


Logic 


47 
tional calculus, we shall assume that part of this task of language
construction has already been carried out. That is, there
is given a set of propositions constructed under some syntactical
rules which make it unambiguous when a proposition belongs to this
set and when two propositions of this set are the same or distinct.
We further require that the propositions of this set are each not
of one of the forms $A \leftrightarrow B$, $A \to B$, $A \& B$, $A \lor B$, or $\sim A$, where
A and B are propositions of the set (see below for the symbols
$\leftrightarrow$, $\to$, $\&$, $\lor$, and $\sim$). We call these propositions prime
formulas (for the propositional calculus), and denote them by
capital letters P, Q, R, ..., $P_1, P_2, P_3, \ldots$, from the second
half of the alphabet. In this case of prime formulas, different
letters appearing together in a discussion shall stand for distinct
propositions.

Starting from these prime formulas, we generate the class
of propositions to be called formulas by applying, repeatedly in
all possible ways, four binary operations and one unary operation,
thus: If A and B are formulas, so is each of the following:
$A \leftrightarrow B$ (read "A equivalent to B"), $A \to B$ ("A implies B"),
$A \& B$ ("A and B"), $A \lor B$ ("A or B"), $\sim A$ ("not A"). Formulas
generally (not necessarily prime) we denote by capital letters A,
B, C, ... from the first half of the alphabet, different letters
not necessarily representing different formulas. Formulas not
prime are called composite. (In the literature, other symbols
are often used, the common alternatives being: $A \supset B$ for $A \to B$;
$A \land B$ or $A \cdot B$ for $A \& B$; $\neg A$ or $\bar{A}$ for $\sim A$; $A \equiv B$ or $A \sim B$ for
$A \leftrightarrow B$.)

In repeated combinations, parentheses are used to avoid
ambiguity, e.g., to distinguish $(A \& B) \to C$ from $A \& (B \to C)$.


Logic 48 
To avoid an excess of parentheses, we use conventions, as in algebra.
Thus $\leftrightarrow$ shall be strongest (i.e., it is to encompass most),
$\to$ is next weaker, then $\&$ and $\lor$, and $\sim$ weakest. This should not be too hard
to remember, as $\leftrightarrow$ when present (and otherwise $\to$) will usually
be the principal symbol, like the = in an equation. Then $\sim$ is our
only unary operator, like $-$ used to form the negative of a
number, or $^2$ as an exponent.

Examples of the use of these conventions are given below,
with analogous examples from ordinary algebra to the right.

$A \& B \to C$  means  $(A \& B) \to C$                        $a + b = c$  means  $(a + b) = c$
                not    $A \& (B \to C)$                                     not    $a + (b = c)$
$A \leftrightarrow B \to C$  means  $A \leftrightarrow (B \to C)$     $a + b \cdot c$  means  $a + (b \cdot c)$
                not    $(A \leftrightarrow B) \to C$                        not    $(a + b) \cdot c$
$\sim A \lor B$  means  $(\sim A) \lor B$                      $-a + b$  means  $(-a) + b$
                not    $\sim (A \lor B)$                                    not    $-(a + b)$
                                                               $a \cdot b^2$  means  $a \cdot (b)^2$
                                                                            not    $(a \cdot b)^2$

The propositional calculus treats of those logical relations
among formulas which depend only on how the formulas are
composed out of the prime formulas. Thus it does not take into
account the internal structure of the prime formulas, but only
their individuality, i.e., the possibility of recognizing them
and distinguishing between different ones. (This can be enforced
by treating the letters P, Q, R, ..., $P_1, P_2, P_3, \ldots$ themselves
as being the prime formulas. They are then called proposition
variables, i.e., each standing for an unspecified proposition.)


The classical propositional calculus, which we treat, now 


Logic 49 
makes the assumption that each prime formula is either true (t)
or false (f), but not both. But since in the propositional calculus
we do not look at the internal structure of the prime formulas
(or do not specify the propositions represented by the proposition
variables), we do not know which is the case.

We now define how the truth value, "true" or "false," of
the composite formulas depends on the truth values of the prime
formulas. This is accomplished by tables which give the value of
$A \leftrightarrow B$, $A \to B$, $A \& B$, $A \lor B$, and $\sim A$ when entered from the
values of A and B, as follows:

          A equivalent to B     A implies B
          A if and only if B    If A then B                   A or B
                                A only if B     A and B       (inclusive)          not A
 A   B    $A \leftrightarrow B$    $A \to B$    $A \& B$     $A \lor B$       A    $\sim A$
 t   t           t                   t             t             t             t       f
 t   f           f                   f             f             t             f       t
 f   t           f                   t             f             t
 f   f           t                   t             f             f

These tables should be learned, not however by rote, but by think- 
ing through that the table renders the meaning suggested by the 
names written above in English. Where the English is ambiguous, 
note how the table resolves the ambiguity within our limitation 
to choosing either a t or an f in each case. 
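
The defining tables can also be written out as small functions. The Python sketch below is an illustration of ours: it assumes that t and f are modelled by Python's True and False, and the function names are our own. It prints the four lines of the two-argument table in the order tt, tf, ft, ff.

```python
from itertools import product


def equiv(a, b):
    """A <-> B: t exactly when A and B have the same value."""
    return a == b


def implies(a, b):
    """A -> B: f only when A is t and B is f."""
    return (not a) or b


def conj(a, b):
    """A & B."""
    return a and b


def disj(a, b):
    """A V B (inclusive or)."""
    return a or b


def neg(a):
    """~A."""
    return not a


def letter(value):
    return "t" if value else "f"


def basic_table():
    """Rows (A, B, A<->B, A->B, A&B, AVB) for the four assignments."""
    rows = []
    for a, b in product([True, False], repeat=2):
        values = (a, b, equiv(a, b), implies(a, b), conj(a, b), disj(a, b))
        rows.append(tuple(letter(v) for v in values))
    return rows


for row in basic_table():
    print(" ".join(row))
```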

Now consider any formula E containing exactly the
prime formulas $P_1, \ldots, P_n$. We shall call these P's the prime
components of E. How the truth value of E depends on the truth
values of $P_1, \ldots, P_n$ can be exhibited in a truth table, entered
from each of the $2^n$ possible assignments of t's and f's to
$P_1, \ldots, P_n$. Here are several examples.


Logic 50


           I            II                  III
 $P_1$   $P_1$   $P_1 \& \sim P_1$     $P_1 \to P_1$
   t       t            f                    t
   f       f            f                    t

                       IV                   V
 $P_1$  $P_2$    $P_1 \to P_2$     $\sim P_1 \lor P_2$
   t      t            t                    t
   t      f            f                    f
   f      t            t                    t
   f      f            t                    t

                               VI                              VII
 $P_1$  $P_2$  $P_3$    $P_1 \to (P_2 \to P_3)$    $(P_1 \lor P_2) \& (P_1 \to P_3)$
   t      t      t                t                               t
   t      t      f                f                               f
   t      f      t                t                               t
   t      f      f                t                               f
   f      t      t                t                               t
   f      t      f                t                               t
   f      f      t                t                               f
   f      f      f                t                               f


To illustrate the computation, take Line 5 of Table VII:

          $(P_1 \lor P_2) \& (P_1 \to P_3)$
 (i)      $(f \lor t) \& (f \to t)$
 (ii)     $t \& t$
 (iii)    $t$

For $P_1$, $P_2$, $P_3$ we substitute the values f, t, t respectively
taken from Line 5; this gives (i). In (ii) we have evaluated the
$f \lor t$ by the table for $\lor$, and the $f \to t$ by the table for $\to$.
In (iii), we have evaluated the $t \& t$ by the table for $\&$. Space
can be saved by not writing the $\lor$, $\&$, $\to$ in (i)-(iii) and telescoping
(i), (ii), (iii) together vertically. The result is Line 5
as written in Table VII. This computation illustrates the defini-


Logic 51
tion of the truth table; but often shortcuts can be made. Thus
in Table VI, the observation that $A \to B$ is t whenever A is f
(by the table for $\to$) gives in one step t's for Lines 5-8.
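
The computation of a whole value column can be mechanized in the same way. In the following Python sketch (ours, with True and False standing for t and f, and with truth_table a name of our choosing), the formulas of Tables VI and VII are written as functions of $P_1, P_2, P_3$, and the assignments are taken in the order ttt, ttf, ..., fff used in the tables.

```python
from itertools import product


def implies(a, b):
    """A -> B by its table: f only when A is t and B is f."""
    return (not a) or b


def truth_table(formula, n):
    """Value column of `formula` for the 2**n assignments to its n letters."""
    return [
        "t" if formula(*values) else "f"
        for values in product([True, False], repeat=n)
    ]


# Table VI: P1 -> (P2 -> P3).   Table VII: (P1 V P2) & (P1 -> P3).
table_vi = truth_table(lambda p1, p2, p3: implies(p1, implies(p2, p3)), 3)
table_vii = truth_table(lambda p1, p2, p3: (p1 or p2) and implies(p1, p3), 3)

print("VI: ", " ".join(table_vi))
print("VII:", " ".join(table_vii))
print("Line 5 of VII:", table_vii[4])
```

Lines are numbered from 1 in the text and from 0 in Python, so Line 5 is the entry with index 4.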

Now we ask: what formulas are true on the basis of
the propositional calculus alone? The answer is given by reflecting
that in the propositional calculus the truth or falsity
of the prime formulas is unavailable to us. So in the propositional
calculus we are in a position to say that a formula is
true when and only when its truth value will be t whatever the
values (each either t or f) of its prime components $P_1, \ldots, P_n$,
i.e., when the value is t for each of the $2^n$ assignments of t's
or f's to $P_1, \ldots, P_n$, i.e., when (the value column of) its truth
table consists entirely of t's. In this case, we call the formula
valid (in the propositional calculus), or it may be called identically
true or said to be a tautology. For example, looking at
Table III, $P_1 \to P_1$ is valid. None of the other six formulas is
valid; e.g., by Table VI, $P_1 \to (P_2 \to P_3)$ is not valid, since
(by Line 2) it is f when $P_1$, $P_2$, $P_3$ are t, t, f, respectively.

To give a verbal example, the proposition "If I am 
going too fast, then I am going too fast" is true on the basis of 
the propositional calculus (it is valid, by Table III). But the 
proposition "I am going too fast", if true, is true on other 
grounds (it fails to be valid, by Table I). 

It might seem that the valid formulas or tautologies 
are the least interesting, because from one point of view they 
give no information. My admitting that "If I am going too fast, 
I am going too fast" can hardly give any of you much satisfac- 


tion. But it will appear as we proceed that the tautologies are 


very important. 


Logic 


52 

The definition of validity provides us with an automatic
way of deciding as to the validity of any proposed formula:
simply compute its truth table, and see whether we get all t's.
This is a very fortunate situation, and one should not hesitate to
do this in any case of doubt.
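
This automatic test is easily programmed. The sketch below is ours (Python, with True and False for t and f; the name is_valid is an assumption of the sketch): it runs through all $2^n$ assignments and reports whether the value is t in every case.

```python
from itertools import product


def implies(a, b):
    """A -> B by its table: f only when A is t and B is f."""
    return (not a) or b


def is_valid(formula, n):
    """True when `formula` has the value t for all 2**n assignments."""
    return all(formula(*values) for values in product([True, False], repeat=n))


# Table III: P1 -> P1 is valid.
print(is_valid(lambda p1: implies(p1, p1), 1))

# Table VI: P1 -> (P2 -> P3) is not valid (it is f for t, t, f).
print(is_valid(lambda p1, p2, p3: implies(p1, implies(p2, p3)), 3))
```

The number of assignments doubles with each additional prime component, so the test, though always available, becomes laborious for formulas with many letters.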

However, computing truth tables of formulas at random
would be a rather slow way of discovering valid formulas. Anyone
not actually familiar with simple examples of valid formulas, and
with methods for proceeding to others (whether or not he has officially
studied logic), would properly be described as sluggish in
his mental processes.

One simple principle is this. In defining validity,
we used a truth table entered from the prime components, so as to
take into account all the structure of the formula available to
the propositional calculus. However, to establish validity, we
may not need to dissect a formula all the way down to its prime
components. If we get all t's in a table entered from components
not necessarily prime, we can be sure it is valid. For example,
$(P \& \sim P) \to (P \& \sim P)$ is of the form $A \to A$; Table VIII entered
from A gives all t's; hence the formula is valid. For in computing
Table IX entered from P, the first part of the computation
consists in computing the value of $P \& \sim P$, i.e., the A, and then
the rest of the computation in computing the value of the whole
from that of A (the value f written under each $P \& \sim P$ in Line 1 of
Table IX), which we have already done with result t in computing
Table VIII (Line 2).

      VIII                              IX
 A    $A \to A$          P    $(P \& \sim P) \to (P \& \sim P)$
 t        t              t          f         t         f
 f        t              f          f         t         f


Logic 


53 
Table VIII is the same as Table III except for the notation; instead
of saying we construct a table for $A \to A$ entered from A,
it comes to the same thing to say that we verify the validity of
$P_1 \to P_1$, and then substitute A, i.e., $P \& \sim P$, for $P_1$ in it.
This reasoning gives the following theorem, in which we write
"$\models E$" as a short way of saying "E is valid."

Theorem 1. (Substitution Rule). Let E be a formula
having $P_1, \ldots, P_n$ as prime components, and let $E^*$ come from
E by substituting formulas $A_1, \ldots, A_n$ for $P_1, \ldots, P_n$, respectively.
If $\models E$, then $\models E^*$.

On the other hand, to show by truth tables that a
formula is not valid, the tables must be constructed from the
prime components. For example, $P \& \sim P \to Q$ is of the form
$A \to B$. A table constructed to be entered from A and B would
not have all t's (in other words, $P_1 \to P_2$ is not valid). But
$P \& \sim P \to Q$ is valid, as can readily be seen from the fact that
the table for $P \& \sim P$ has all f's (see Table II), and thus only
the value f for A would be used in the table for $A \to B$.
These remarks show that, in Theorem 1, we cannot conclude that
if $\models E^*$, then $\models E$. (However, in the special case that a table
not constructed from prime components gives all f's, we would be
justified in concluding without further dissection that the formula
is not valid.)
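
Both halves of this remark can be checked mechanically. In the Python sketch below (ours; True and False stand for t and f, and the names form, instance and is_valid are assumptions of the sketch), the form $A \to B$ entered from A and B fails to be valid, while the formula obtained from it by substituting $P \& \sim P$ for A and Q for B is valid when its table is entered from the prime components P and Q.

```python
from itertools import product


def implies(a, b):
    """A -> B by its table: f only when A is t and B is f."""
    return (not a) or b


def is_valid(formula, n):
    """True when `formula` has the value t for all 2**n assignments."""
    return all(formula(*values) for values in product([True, False], repeat=n))


def form(a, b):
    """The form A -> B, entered from A and B."""
    return implies(a, b)


def instance(p, q):
    """P & ~P -> Q, i.e. the form with P & ~P for A and Q for B."""
    return form(p and not p, q)


print(is_valid(form, 2))       # the form itself is not valid
print(is_valid(instance, 2))   # but this substitution instance is valid
```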


Logic 54
Suppose that the truth table for a formula E is constructed
as above by using exactly its prime components $P_1, \ldots, P_n$, and
suppose that a new table is constructed for E by using additional
prime formulas $P_{n+1}, \ldots, P_{n+m}$ not in E. Then the new table
differs from the original only in the fact that the value column
of the new table splits into $2^m$ parts, corresponding to the $2^m$
assignments of t's and f's to the prime formulas $P_{n+1}, \ldots, P_{n+m}$
which do not occur in E. Each of these $2^m$ parts is a duplicate of
the value column of the original table, since the same computation
(based only on the assignments to $P_1, \ldots, P_n$) is used in each
part. For example, with n = 2 and m = 1, Tables X, XI and XII
below have been constructed by entering from three prime formulas,
although the formula at the head of each table contains just two
prime formulas.


                                    VII                         X               XI              XII
 $P_1$  $P_2$  $P_3$    $(P_1 \lor P_2) \& (P_1 \to P_3)$    $P_2 \lor P_3$    $P_1 \& P_3$    $P_2 \to P_2 \lor P_3$
   t      t      t                   t                          t               t                t
   t      t      f                   f                          t               f                t
   t      f      t                   t                          t               t                t
   t      f      f                   f                          f               f                t
   f      t      t                   t                          t               f                t
   f      t      f                   t                          t               f                t
   f      f      t                   f                          t               f                t
   f      f      f                   f                          f               f                t


In Tables X and XII, Lines 5-8 ($P_1$ is f) are duplicates
respectively of Lines 1-4 ($P_1$ is t), and in Table XI, Lines 3, 4,
7, 8 ($P_2$ is f) are duplicates respectively of Lines 1, 2, 5, 6
($P_2$ is t). (Table VII is repeated here for comparison with
Tables X and XI later.)


In particular, if the table for E when constructed only 


Logic 55 
from its prime components, contains only t's, then the addition
of one or more prime formulas not in E to the table will result
in a new table containing all t's. (This case is illustrated by
Table XII.) For Theorem 2, in order to compare the truth tables
for A and for B, they must be constructed on the basis of the
same list $P_1, \ldots, P_n$, which must contain at least the prime formulas
occurring in either A or B, but by the preceding remark
there is no harm if the list contains more prime formulas.

Theorem 2. $\models A \leftrightarrow B$ if and only if A and B have the
same truth table.

Proof. In the computation of the value of $A \leftrightarrow B$ for a
given assignment of t's and f's to $P_1, \ldots, P_n$, the first part
consists in computing the values of A and B, after which the computation
is concluded by applying the table for $A \leftrightarrow B$. From
this table we see that $A \leftrightarrow B$ is t if and only if the values
computed for A and B are the same.

Example: By Tables IV and V (p. 50), $P_1 \to P_2 \leftrightarrow \sim P_1 \lor P_2$
is valid.
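
Theorem 2 can be watched at work on this example. The Python sketch below is ours (True and False for t and f; truth_table and equivalence_is_valid are names of our choosing). It compares the value columns of two formulas entered from the same list of letters, and separately tests the validity of their equivalence; by the theorem the two answers must agree.

```python
from itertools import product


def implies(a, b):
    """A -> B by its table: f only when A is t and B is f."""
    return (not a) or b


def truth_table(formula, n):
    """Value column of `formula` over all 2**n assignments, as a tuple."""
    return tuple(formula(*values) for values in product([True, False], repeat=n))


def equivalence_is_valid(f, g, n):
    """Test whether f <-> g has the value t for every assignment."""
    return all(
        f(*values) == g(*values) for values in product([True, False], repeat=n)
    )


def table_iv(p1, p2):
    return implies(p1, p2)        # P1 -> P2


def table_v(p1, p2):
    return (not p1) or p2         # ~P1 V P2


def converse(p1, p2):
    return implies(p2, p1)        # P2 -> P1, for contrast


print(truth_table(table_iv, 2) == truth_table(table_v, 2),
      equivalence_is_valid(table_iv, table_v, 2))
print(truth_table(table_iv, 2) == truth_table(converse, 2),
      equivalence_is_valid(table_iv, converse, 2))
```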


Corollary. (Replacement Rule). Let $C_A$ be a formula containing
a formula A as a specified part, and let $C_B$ come from $C_A$ by
replacing this occurrence of A by B. If $\models A \leftrightarrow B$ and $\models C_A$,
then $\models C_B$.


Proof. We have to show that $C_B$ has only t's in its truth
table. By assumption, $C_A$ has only t's in its truth table. By
the theorem, replacing, in the computation for any line of the
table, the computation of the specified part A by a computation
of B instead, will not change the outcome. Thus $C_B$ will likewise
have the value t in every line of its table.


Theorem 3. If $\models A$ and $\models A \to B$, then $\models B$.


Proof. In constructing the table for $A \to B$, consider
any assignment of t's and f's to the prime components $P_1, \ldots, P_n$
of A and B. The computation of the value of $A \to B$ consists
in first computing the values of A and B, and then computing the
value of $A \to B$ from these by the table for $\to$. By the hypotheses
that $\models A$ and $\models A \to B$, both the value obtained for
A, and the final value for $A \to B$ are t. From the table for
$A \to B$ (see p. 49), it is clear that this can only be the case
if the value obtained for B was also t. Since this is the case
for each assignment to $P_1, \ldots, P_n$, B receives the value t
for all assignments, i.e., B is valid as was to be proved. (If
A or B contains letters not in the other, the tables for A or B
which are involved here are entered from more than the minimum
number of prime formulas. By the remark preceding Theorem 2,
this is harmless.)

As the next theorem, we give a collection of valid formulas.
The reader is not expected to learn the list outright, but
rather to use it for reference, and in doing so to become familiar
with those most frequently used. Each of the formulas can
be proved valid by constructing for it a truth table entered from
A, B, C, which is justified by Theorem 1 (p. 53). For example,
by Table III (p. 50) (with Theorem 1), we get 11. By the example
following Theorem 2, we get 36. The Venn diagram proofs for the
Boolean identities (p. 8) can
be constructed as truth table proofs for 22-35, if we translate
the Boolean symbols into logical symbols as follows: $\cup$ becomes
$\lor$, $\cap$ becomes $\&$, = becomes $\leftrightarrow$, and the complement sign becomes $\sim$.
As an illustration
for the corollary to Theorem 2, we can prove 37, assuming




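Logic 57
Theorem 4

(Introductions and eliminations of logical symbols)

1.   $\models A \to (B \to A)$.
2.   $\models (A \to B) \to [(A \to (B \to C)) \to (A \to C)]$.
3.   $\models (A \to B) \to [(A \to \sim B) \to \sim A]$.
4.   $\models \sim\sim A \to A$.
5.   $\models A \to (B \to A \& B)$.
6a.  $\models A \& B \to A$.
6b.  $\models A \& B \to B$.
7a.  $\models A \to A \lor B$.
7b.  $\models B \to A \lor B$.
8.   $\models (A \to C) \to [(B \to C) \to (A \lor B \to C)]$.
9.   $\models (A \to B) \to [(B \to A) \to (A \leftrightarrow B)]$.
10a. $\models (A \leftrightarrow B) \to (A \to B)$.
10b. $\models (A \leftrightarrow B) \to (B \to A)$.

(Principle of identity, chain inference, importation and exportation,
interchange)

11.  $\models A \to A$.
12.  $\models (A \to B) \to [(B \to C) \to (A \to C)]$.
13.  $\models [A \to (B \to C)] \leftrightarrow [A \& B \to C]$.
14.  $\models [A \to (B \to C)] \leftrightarrow [B \to (A \to C)]$.

(Contraposition, laws of double negation, excluded middle,
non-contradiction)

15.  $\models (A \to B) \leftrightarrow (\sim B \to \sim A)$.
16.  $\models \sim\sim A \leftrightarrow A$.
17.  $\models A \lor \sim A$.
18.  $\models \sim(A \& \sim A)$.

(Reflexive, symmetric, and transitive properties of equivalence)

19.  $\models A \leftrightarrow A$.
20.  $\models (A \leftrightarrow B) \leftrightarrow (B \leftrightarrow A)$.
21.  $\models (A \leftrightarrow B) \& (B \leftrightarrow C) \to (A \leftrightarrow C)$.

(Some Boolean algebra relations)

22.  $\models A \lor B \leftrightarrow B \lor A$.
23.  $\models A \& B \leftrightarrow B \& A$.
24.  $\models (A \lor B) \lor C \leftrightarrow A \lor (B \lor C)$.
25.  $\models (A \& B) \& C \leftrightarrow A \& (B \& C)$.
26.  $\models A \lor (B \& C) \leftrightarrow (A \lor B) \& (A \lor C)$.
27.  $\models A \& (B \lor C) \leftrightarrow (A \& B) \lor (A \& C)$.
28.  $\models A \lor A \leftrightarrow A$.
29.  $\models A \& A \leftrightarrow A$.
30.  $\models A \lor (A \& B) \leftrightarrow A$.
31.  $\models A \& (A \lor B) \leftrightarrow A$.
32.  $\models \sim(A \lor B) \leftrightarrow \sim A \& \sim B$.
33.  $\models \sim(A \& B) \leftrightarrow \sim A \lor \sim B$.
34.  $\models A \lor B \leftrightarrow \sim(\sim A \& \sim B)$.
35.  $\models A \& B \leftrightarrow \sim(\sim A \lor \sim B)$.

(Expressions for some symbols in terms of others; see also 34, 35)

36.  $\models A \to B \leftrightarrow \sim A \lor B$.
37.  $\models A \to B \leftrightarrow \sim(A \& \sim B)$.
38.  $\models A \& B \leftrightarrow \sim(A \to \sim B)$.
39.  $\models A \lor B \leftrightarrow \sim A \to B$.
40.  $\models (A \leftrightarrow B) \leftrightarrow (A \to B) \& (B \to A)$.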
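
Since each formula of Theorem 4 is to be proved by a truth table entered from A, B, C, the whole list can be checked at once by machine. The Python sketch below is ours and is offered only as such a check: True and False stand for t and f, the formulas are transcribed as we read them from the list, and the names imp, iff, FORMULAS and is_valid are assumptions of the sketch.

```python
from itertools import product


def imp(a, b):
    """A -> B by its table."""
    return (not a) or b


def iff(a, b):
    """A <-> B by its table."""
    return a == b


# Each entry is a function of the values of A, B, C (unused letters are ignored).
FORMULAS = {
    "1": lambda A, B, C: imp(A, imp(B, A)),
    "2": lambda A, B, C: imp(imp(A, B), imp(imp(A, imp(B, C)), imp(A, C))),
    "3": lambda A, B, C: imp(imp(A, B), imp(imp(A, not B), not A)),
    "4": lambda A, B, C: imp(not (not A), A),
    "5": lambda A, B, C: imp(A, imp(B, A and B)),
    "6a": lambda A, B, C: imp(A and B, A),
    "6b": lambda A, B, C: imp(A and B, B),
    "7a": lambda A, B, C: imp(A, A or B),
    "7b": lambda A, B, C: imp(B, A or B),
    "8": lambda A, B, C: imp(imp(A, C), imp(imp(B, C), imp(A or B, C))),
    "9": lambda A, B, C: imp(imp(A, B), imp(imp(B, A), iff(A, B))),
    "10a": lambda A, B, C: imp(iff(A, B), imp(A, B)),
    "10b": lambda A, B, C: imp(iff(A, B), imp(B, A)),
    "11": lambda A, B, C: imp(A, A),
    "12": lambda A, B, C: imp(imp(A, B), imp(imp(B, C), imp(A, C))),
    "13": lambda A, B, C: iff(imp(A, imp(B, C)), imp(A and B, C)),
    "14": lambda A, B, C: iff(imp(A, imp(B, C)), imp(B, imp(A, C))),
    "15": lambda A, B, C: iff(imp(A, B), imp(not B, not A)),
    "16": lambda A, B, C: iff(not (not A), A),
    "17": lambda A, B, C: A or (not A),
    "18": lambda A, B, C: not (A and (not A)),
    "19": lambda A, B, C: iff(A, A),
    "20": lambda A, B, C: iff(iff(A, B), iff(B, A)),
    "21": lambda A, B, C: imp(iff(A, B) and iff(B, C), iff(A, C)),
    "22": lambda A, B, C: iff(A or B, B or A),
    "23": lambda A, B, C: iff(A and B, B and A),
    "24": lambda A, B, C: iff((A or B) or C, A or (B or C)),
    "25": lambda A, B, C: iff((A and B) and C, A and (B and C)),
    "26": lambda A, B, C: iff(A or (B and C), (A or B) and (A or C)),
    "27": lambda A, B, C: iff(A and (B or C), (A and B) or (A and C)),
    "28": lambda A, B, C: iff(A or A, A),
    "29": lambda A, B, C: iff(A and A, A),
    "30": lambda A, B, C: iff(A or (A and B), A),
    "31": lambda A, B, C: iff(A and (A or B), A),
    "32": lambda A, B, C: iff(not (A or B), (not A) and (not B)),
    "33": lambda A, B, C: iff(not (A and B), (not A) or (not B)),
    "34": lambda A, B, C: iff(A or B, not ((not A) and (not B))),
    "35": lambda A, B, C: iff(A and B, not ((not A) or (not B))),
    "36": lambda A, B, C: iff(imp(A, B), (not A) or B),
    "37": lambda A, B, C: iff(imp(A, B), not (A and (not B))),
    "38": lambda A, B, C: iff(A and B, not imp(A, not B)),
    "39": lambda A, B, C: iff(A or B, imp(not A, B)),
    "40": lambda A, B, C: iff(iff(A, B), imp(A, B) and imp(B, A)),
}


def is_valid(formula):
    """True when `formula` is t for all 8 assignments to A, B, C."""
    return all(formula(*values) for values in product([True, False], repeat=3))


failures = [name for name, formula in FORMULAS.items() if not is_valid(formula)]
print("formulas checked:", len(FORMULAS), "failures:", failures)
```

By Theorem 1 it suffices to enter the tables from A, B, C as though these were prime formulas, which is what the check does.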


Logic 58 


16, 34, 36 already proved, thus: First, substituting $\sim A$ for A in
34 (which is permissible since the A there is any formula), we
get $\models \sim A \lor B \leftrightarrow \sim(\sim\sim A \& \sim B)$, which we use as the "$\models A \leftrightarrow B$"
of the corollary. Next, using 36 as the "$\models C_A$", and applying
the corollary, we get $\models A \to B \leftrightarrow \sim(\sim\sim A \& \sim B)$. Finally, we get
37 by applying the corollary again, this time using
$\models A \to B \leftrightarrow \sim(\sim\sim A \& \sim B)$ as the "$\models C_A$" and 16 as the
"$\models A \leftrightarrow B$".


Theorem 5. Let E be a formula composed from
$P_1, \ldots, P_n, \sim P_1, \ldots, \sim P_n$ by use of only $\&$ and $\lor$. Let $E^\dagger$ be
the result of replacing each occurrence of $\&$ by $\lor$ and vice versa
and replacing each occurrence of $P_i$ not preceded by $\sim$ by $\sim P_i$ and
vice versa. Then $\models \sim E \leftrightarrow E^\dagger$.


Proof. We start with FwE <-> ~E (by 19), and transform 
the right side by using the corollary to Theorem 2, 32, and 33 
to move the ~ of wE toward the right (i.e., inward) progressively 
across the 's and V's (which interchanges them). Finally, we 
use 16 to, Be nan stens Py (not preceded by ~~) and the vP,. The 


following example illustrates the procedure. 


|= ~((~P V Q) & (P & (~P V Q))) <-> ~((~P V Q) & (P & (~P V Q)))     19
                                <-> ~(~P V Q) V ~(P & (~P V Q))       33
                                <-> (~~P & ~Q) V (~P V ~(~P V Q))     32, 33
                                <-> (~~P & ~Q) V (~P V (~~P & ~Q))    32
                                <-> (P & ~Q) V (~P V (P & ~Q))        16




The negation ~E of a general formula E can be "evaluated"
as follows: We eliminate -> and <-> from E by the use
of 36 or 37, and 40, and the replacement rule, obtaining an
equivalent formula E_1. Then Theorem 5 can be applied to move


Logic 59
each ~ which is not immediately preceding one of P_1, ..., P_n
inward, each time working on such a ~ which has no other such 
inside it. 
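
The procedure just described can be sketched in Python. This is only an illustration under assumptions of our own: formulas are represented as nested tuples, the names `elim`, `push` and `value` are hypothetical, and the recursion works from the outermost ~ inward rather than in the order suggested above, which gives the same final formula. The example at the end is the one worked out after Theorem 5.

```python
from itertools import product

def elim(f):
    """Eliminate -> and <-> by 36 and 40."""
    op = f[0]
    if op == 'var':
        return f
    if op == 'not':
        return ('not', elim(f[1]))
    a, b = elim(f[1]), elim(f[2])
    if op == 'imp':                       # 36: A -> B  as  ~A V B
        return ('or', ('not', a), b)
    if op == 'iff':                       # 40, then 36 on each half
        return ('and', ('or', ('not', a), b), ('or', ('not', b), a))
    return (op, a, b)

def push(f):
    """Move each ~ inward by 32 and 33, cancelling ~~ by 16.
    Expects a formula built from 'var', 'not', 'and', 'or' only."""
    op = f[0]
    if op == 'var':
        return f
    if op in ('and', 'or'):
        return (op, push(f[1]), push(f[2]))
    g = f[1]                              # f is ~g
    if g[0] == 'var':
        return f
    if g[0] == 'not':                     # 16
        return push(g[1])
    dual = 'or' if g[0] == 'and' else 'and'
    return (dual, push(('not', g[1])), push(('not', g[2])))

def value(f, env):
    """Truth value of f under an assignment env (dict from names to bool)."""
    op = f[0]
    if op == 'var':
        return env[f[1]]
    if op == 'not':
        return not value(f[1], env)
    a, b = value(f[1], env), value(f[2], env)
    if op == 'and':
        return a and b
    if op == 'or':
        return a or b
    if op == 'imp':
        return (not a) or b
    return a == b                         # 'iff'

P, Q = ('var', 'P'), ('var', 'Q')
part = ('or', ('not', P), Q)              # ~P V Q
E = ('and', part, ('and', P, part))       # (~P V Q) & (P & (~P V Q))
result = push(('not', elim(E)))           # (P & ~Q) V (~P V (P & ~Q))
```

Comparing `value(('not', E), env)` with `value(result, env)` over the four assignments to P and Q gives a table check of |= ~E <-> E† for this E.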


8. The propositional calculus (model theory); valid consequence 


We started this chapter by saying that logic would have the
very important function of saying what follows from what - thus, 
of saying what propositions are theorems for given axioms. Yet 
thus far we have only dealt with valid formulas - or tautologies - 
which logic asserts to hold without regard to any extra-logical 
assumptions whatsoever. 

Still keeping in mind that in the propositional calculus 
we cannot look at the internal structure of the prime formulas 
(or will not know the propositions which the proposition variables
represent), let us suppose that we are given from outside the 
propositional calculus that a formula A is true by assumption or 
fact. That is, we may be told that it is an axiom of some ab- 
stract theory (like geometry or group theory), so it is true by 
fiat for the purposes of a given mathematical theory. Or it may 
be a proposition which is true in fact or by intuitive mathemati- 
cal reasoning. How does this alter our position with regard to 
what formulas we can assert to be true by use otherwise of only 
the propositional calculus? 

Consider a concrete case; say A is (P_1 V P_2) & (P_1 -> P_3)
(Table VII, p. 54). Remember that what P_1, P_2, and P_3 really are
is top-secret information, and practitioners of the propositional
calculus are not cleared for it. Nevertheless, if we are told
that (P_1 V P_2) & (P_1 -> P_3) is true, we have been told something.
Namely, we know that the truth values of P_1, P_2, P_3 must form one
of the four assignments (Lines 1, 3, 5, 6) which give t for


Logic 60 


(P_1 V P_2) & (P_1 -> P_3) in Table VII. So now, in trying to decide
what other formulas B are true on the basis of the propositional
calculus plus the information that A is true, we need only consider
these four assignments. Thus, upon being given that A is
true, P_2 V P_3 is true because its Table X has only t's in Lines
1, 3, 5, 6; but we still do not have enough information to know
that P_1 & P_3 is true, because its Table XI has f for example
in Line 5. Of course, P_3 -> P_2 V P_3 is true (Table XII) but
we knew this already without use of the information that A is
true (i.e., P_3 -> P_2 V P_3 is valid).
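
The bookkeeping in this concrete case can be reproduced mechanically. The following Python sketch is ours, not part of the notes; it assumes that the lines of Table VII run in the usual order from (t, t, t) down to (f, f, f), which agrees with the line numbers quoted above. The function names are hypothetical.

```python
from itertools import product

# The eight assignments to (P_1, P_2, P_3); Line 1 is (t, t, t),
# Line 2 is (t, t, f), ..., Line 8 is (f, f, f).
LINES = list(product([True, False], repeat=3))

def implies(a, b):
    return (not a) or b

def table_vii(p1, p2, p3):        # A: (P_1 V P_2) & (P_1 -> P_3)
    return (p1 or p2) and implies(p1, p3)

def table_x(p1, p2, p3):          # P_2 V P_3
    return p2 or p3

def table_xi(p1, p2, p3):         # P_1 & P_3
    return p1 and p3

def table_xii(p1, p2, p3):        # P_3 -> P_2 V P_3
    return implies(p3, p2 or p3)

def lines_where(formula):
    """Numbers of the lines on which the formula takes the value t."""
    return [n for n, row in enumerate(LINES, start=1) if formula(*row)]

a_lines = lines_where(table_vii)
x_follows = set(a_lines) <= set(lines_where(table_x))
xi_follows = set(a_lines) <= set(lines_where(table_xi))
xii_valid = lines_where(table_xii) == list(range(1, 9))
```

Here `a_lines` collects the lines left open by the information that A is true, and the three flags record whether each of the other formulas is t on all of those lines (or, for Table XII, on all eight lines).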

This leads us to the following general definition. Consider
two formulas A and B, and let P_1, ..., P_n be the prime
formulas occurring in A or in B. We say that B is a valid
consequence of A (by the propositional calculus), or that the
inference from A to B is valid (by the propositional calculus),
or, in symbols, A |= B, if, in the truth tables for A and B
constructed to be entered from P_1, ..., P_n, B has the value t,
not necessarily for all lines, but at least for all those in
which A is t.

Notice that "A |= B" is a stronger statement than
"If |= A, then |= B." For example, "If |= VII, then |= XI"
(i.e., "If |= (P_1 V P_2) & (P_1 -> P_3), then |= P_1 & P_3") is true,
because "|= VII" is false. But "VII |= XI" is not true.

Now suppose there are given axioms A_1, ..., A_m. Generalizing
from the case m = 1, we define: B is a valid consequence
of A_1, ..., A_m (by the propositional calculus), or, in symbols,
A_1, ..., A_m |= B if, upon constructing truth tables from the
list P_1, ..., P_n of prime formulas occurring in one or more of
A_1, ..., A_m, B, B receives the value t at least for each



Logic 61 


assignment which makes all of A_1, ..., A_m simultaneously t.


We give some examples to illustrate this definition. 


              XIII      XIV       XV                  XVI           XVII
    P Q R     P -> R    Q -> R    (P->R) & (Q->R)     P V Q -> R    (Q->R) -> (P V Q -> R)
 1. t t t       t         t              t                 t                   t
 2. t t f       f         f              f                 f                   t
 3. t f t       t         t              t                 t                   t
 4. t f f       f         t              f                 f                   f
 5. f t t       t         t              t                 t                   t
 6. f t f       t         f              f                 f                   t
 7. f f t       t         t              t                 t                   t
 8. f f f       t         t              t                 t                   t


Inspection of the tables shows that, for example, XIII,
XIV |= XVI (Lines 1, 3, 5, 7, 8); XV |= XVI (Lines 1, 3, 5, 7, 8);
XIII |= XVII (Lines 1, 3, 5, 6, 7, 8).

Theorem 6. (a) A |= B if and only if |= A -> B.

(b) For m ≥ 2: A_1, ..., A_m |= B if and only
if A_1 & ... & A_m |= B, and hence, by (a), if and only if
|= A_1 & ... & A_m -> B.


Proof. (a) By the table for ->, A -> B receives t
anyway for the assignments to P_1, ..., P_n which make A f.
For the remaining assignments, i.e., those which make A t, it receives
the value t exactly when B receives the value t. Thus
|= A -> B, i.e., A -> B will receive the value t for all assignments,
exactly if A |= B, i.e., B receives the value t
for those assignments which make A take the value t.
(b) By the table for &. (Using Theorem 2, with 25 of Theorem 4,
it is immaterial how the parentheses are inserted into
A_1 & ... & A_m for m > 2.)
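
The definition of valid consequence and the reduction in Theorem 6 can be checked on Tables XIII-XVII by brute force. The Python sketch below is an illustration only; the representation of formulas as functions of (P, Q, R) and the names `consequence`, `valid` and `theorem_6_agrees` are our own assumptions.

```python
from itertools import product

def implies(a, b):
    return (not a) or b

def consequence(premises, conclusion, n):
    """A_1, ..., A_m |= B: B is t on every line where all A_i are t."""
    return all(conclusion(*row)
               for row in product([True, False], repeat=n)
               if all(a(*row) for a in premises))

def valid(formula, n):
    """|= B, i.e., the case of no premises."""
    return consequence([], formula, n)

def xiii(p, q, r):                # P -> R
    return implies(p, r)

def xiv(p, q, r):                 # Q -> R
    return implies(q, r)

def xv(p, q, r):                  # (P -> R) & (Q -> R)
    return xiii(p, q, r) and xiv(p, q, r)

def xvi(p, q, r):                 # P V Q -> R
    return implies(p or q, r)

def xvii(p, q, r):                # (Q -> R) -> (P V Q -> R)
    return implies(xiv(p, q, r), xvi(p, q, r))

def theorem_6_agrees(premises, conclusion, n):
    """Compare A_1, ..., A_m |= B with |= A_1 & ... & A_m -> B."""
    def as_formula(*row):
        return implies(all(a(*row) for a in premises), conclusion(*row))
    return consequence(premises, conclusion, n) == valid(as_formula, n)
```

For instance, `consequence([xiii, xiv], xvi, 3)` tests XIII, XIV |= XVI, while `consequence([xiii], xvi, 3)` shows that XIII alone does not suffice (Line 6 is the counterexample).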


Logic 62


Corollary. For m ≥ 1: A_1, ..., A_{m-1}, A_m |= B, if and
only if A_1, ..., A_{m-1} |= A_m -> B. Hence, A_1, ..., A_{m-1}, A_m |= B
if and only if |= A_1 -> (A_2 -> (... (A_m -> B) ...)).


We carry out the proof for m = 3, which contains all the 
essential ideas of the general case. From A_1, A_2, A_3 |= B, we
get |= (A_1 & A_2) & A_3 -> B by the theorem; whence, by 13 of
Theorem 4 with the corollary to Theorem 5, |= A_1 & A_2 -> (A_3 -> B);
finally, by the theorem, A_1, A_2 |= A_3 -> B. Similarly in the
reverse direction. 

Thus the problem of what formulas are valid consequences 
of others, to which this section is devoted, is reduced to 
the problem of what formulas are valid (are "tautologies"), which 
we considered in the preceding section. This is why tautologies 
are important. 

Although valid consequence is thus completely reduced to 
validity, in order to grasp the properties of the consequence re- 
lation it helps to think of these properties directly. Indeed, 


this is the easiest approach to some of the properties of ->
tabulated in Theorem 4. Note that in testing to see if
A_1, ..., A_m |= B, just as in the case of validity, the list
P_1, ..., P_n can contain more prime formulas than only those which
occur as prime components of any of A_1, ..., A_m, B. (Compare the
discussion on pp. 54-55.)


Theorem 7. (i) A_1, ..., A_m |= A_i (i = 1, ..., m).
(ii) If A_1, ..., A_m |= B_i (i = 1, ..., p),
and if B_1, ..., B_p |= C, then A_1, ..., A_m |= C.


Proof. (i) is an immediate consequence of the definition
of |=. For (ii), by the remark preceding the theorem, we can



Logic 63
construct our tables from any list P_1, ..., P_n containing all the
prime components of A_1, ..., A_m, B_1, ..., B_p, C. Now consider
any row for which A_1, ..., A_m are all t. By the hypothesis
A_1, ..., A_m |= B_i (i = 1, ..., p), B_1, ..., B_p will also
be t in this row. Thus, by the hypothesis B_1, ..., B_p |= C,
C will also be t in this row, as was to be proved.

Theorem 7 gives an analysis of the process of inferring 
theorems from a set of axioms A_1, ..., A_m, the formulas written
after |= being those inferred. Thus we can make a list of inferred
formulas in the manner of a proof in Euclid, beginning with
A_1, ..., A_m themselves by (i), and successively adding by (ii) any
formula which we can infer from formulas B_1, ..., B_p already in
the list. 

We can of course take p = 0 in (ii), giving: If |= C,
then A_1, ..., A_m |= C; or, in words, any valid formula (tautology)
can be put into the list. Another special case of (ii) occurs
when B_1, ..., B_p is a subset of A_1, ..., A_m (by (i)). Thus any
theorems which are consequences of some of the axioms are conse- 
quences of all. 

The following extension of Theorem 1 (p. 53) to the case of 
valid consequence is immediate (first apply Theorem 6, then Theo- 
rem 1, then Theorem 6 in the opposite direction): 

If E_1, ..., E_m |= E, then E_1*, ..., E_m* |= E*, where
E_1*, ..., E_m*, E* come from E_1, ..., E_m, E respectively by
substituting A_1, ..., A_n simultaneously for the prime components
P_1, ..., P_n of the E's.

The following strengthened forms of the Corollary to Theo- 
rem 2 and Theorem 3 (Theorems 8 and 9) are established by reason- 


ing entirely similar to that used in the proofs of the original 


Logic 64 
forms. Thus, for Theorem 8, instead of supposing (as we did in
the proof of Corollary, Theorem 2) that A <-> B and C_A are t
for all assignments, we consider just those assignments for which
they are both t, and conclude as before that for these A and
B must have the same value, and hence that C_B has the value t.


Theorem 9 is established by a similar adaptation of the proof of 


Theorem 3. 

Theorem 8. Let C_A be a formula containing a specified
occurrence of A, and let C_B be the result of replacing this
occurrence of A by B. Then A <-> B, C_A |= C_B.


Theorem 9. A, A -> B |= B.


Not only are all the formulas of Theorem 4 (p. 57) valid
consequences, by Theorem 7 (ii) for p = 0, of any assumptions
A_1, ..., A_m, but also those which are implications or equivalences
(cf. 10a, 10b) lead via Theorems 6 and 3 to consequence relationships.
For example, 4 with Theorem 6(a) gives ~~A |= A. As
another example, 3 gives: If A |= B and A |= ~B, then |= ~A.
(For suppose A |= B and A |= ~B. Then by Theorem 6, |= A -> B
and |= A -> ~B. Hence, by 3 and two applications of Theorem 3, |= ~A.)


9. The propositional calculus (proof theory): provability and deducibility

The proof of theorems, or the deduction of consequences of 
assumptions in mathematics, proceeds a la Euclid (or as in our 
analysis following Theorem 7) by putting propositions in a list 
called a proof or deduction. (Proof is perhaps the better word 
when the axioms have a permanent status for the theory; deduction 
when we are not thinking of them as permanent for the theory under 


consideration.) The step from one proposition or formula in the


list to another is mediated by logic, as analyzed above for the 


Logic 65
simple case that logic is the propositional logic only. Now the 
thought arises: Can't the proofs of logic themselves be given in 
the same manner? Then the steps will be mediated by applications 
of specified rules, instead of by somewhat flexibly applicable 
logical criteria. We shall find in this section that this can be 
done. 

We now give such a formulation of propositional calculus, 
both by itself and for its application to deductions from assump- 
tions. This formulation may be called "proof theory," and the 
formulation given in Secs. 7 and 8 "model theory.” Then at the 
end of this section we shall show that the two formulations give 
equivalent results. 

As axioms for the system of propositional calculus, we take 
any formula which has the form of any of the formulas occurring 
in 1-10b of Theorem 4 (p. 57). (These forms themselves may be 
called axiom schemata. Each axiom schema includes infinitely 
many axioms, one for each choice of the formulas occurring in the 
axiom schema.) For example, corresponding to 1 of Theorem 4, we
have as Axiom Schema 1: A -> (B -> A). Particular axioms by
this schema are: P -> (P -> P), P -> (Q -> P),
~P -> (Q & R -> ~P), [P -> (R -> P)] -> [R -> (P -> (R -> P))],
etc.

As the sole rule of inference, called the rule of detach- 
ment or modus ponens, we take the operation of passing from two 
formulas of the forms A and A -> B to the formula B. (Cf.
Theorems 3 and 9.) 


We define a proof (in the propositional calculus) to be a
finite list of formulas B_1, ..., B_ℓ, such that each one is either
an axiom of the propositional calculus, or comes from a pair


Logic 66


preceding it in the list by the rule of detachment. Such a proof
is said to be a proof of its last formula B_ℓ, and this formula
B_ℓ is said to be provable, or, in symbols, |- B_ℓ.

For example, the following is a proof of the formula A -> A.
(Here A can be any fixed formula.)

1. A -> (A -> A)                                              Axiom Schema 1.
2. [A -> (A -> A)] -> {[A -> ((A -> A) -> A)] -> [A -> A]}    Axiom Schema 2.
3. [A -> ((A -> A) -> A)] -> [A -> A]                         Rule of detachment, 1, 2.
4. A -> ((A -> A) -> A)                                       Axiom Schema 1.
5. A -> A                                                     Rule of detachment, 4, 3.


In 1, we have applied Axiom Schema 1 (p. 57) with A as
the "A" and A also as the "B" of the axiom schema. In 2, A is
the "A", A -> A is the "B" and A is the "C" of Axiom Schema 2.
In applying the rule of detachment to 1 and 2, the "A" of the
rule is A -> (A -> A) (which is 1), and the "B" of the rule is 3.

The fact that the proof of such an uncomplicated formula 
as A -> A is long stems from the fact that we have only one
simple rule of inference. If we had occasion here to develop the 
proof theory in detail, we could state some theorems (called 
derived rules), asserting the existence of proofs under various 
conditions. 

A few remarks may help to make this way of treating the 
propositional calculus as an axiomatic-deductive theory seem 
reasonable. We have an infinite number of axioms for each axiom 
schema, as mentioned above. This could be avoided by requiring 
the language in which the prime formulas are constructed to include 
single letters as "proposition variables" and by adding a second 


rule of inference, the substitution rule, to say that E* can be 


Logic 67


inferred from E under the circumstances of Theorem 1 (p. 53). 
This procedure seems less in keeping with usual mathematical lan- 
guage than the one we have followed (due to Von Neumann, 1927). 
The list of axiom schemata (thirteen in all) may seem sur- 
prisingly long. However, each symbol must have axioms to charac- 
terize it, i.e., to provide the deductive properties we want it to 


have. We have got along with only two or three axioms for each of
->, <->, &, V, ~, namely, one or two (left column in
Theorem 4) which help to prove formulas in which the symbol is
used (i.e., to "introduce" the symbol), and one or two (right
column) which help us to infer formulas not containing the symbol
(or not containing it so often) from formulas containing it (i.e.,
to "eliminate" the symbol). This, to our mind, elegant arrangement
of axioms is due to Gentzen (1934).

We could get along with only the first four axiom schemata 
by foregoing the use of the symbols &,V, <—> as an official 
part of the language, by considering, e.g., that each time we 
write A&B we are using an abbreviation for ~(A —>~B). There 
are still other possibilities.⁶

For the propositional calculus applied to infer formulas 
from assumptions A_1, ..., A_m, the formulas A_1, ..., A_m are in
effect allowed to function as axioms also. However, we shall not
call them axioms (for the propositional calculus), but, when we
need a name, assumption formulas; and we shall not call B_1, ..., B_ℓ
a proof, but a deduction from A_1, ..., A_m. That is, a finite
list of formulas B_1, ..., B_ℓ is a deduction (of B_ℓ) from


⁶See Church, A., Introduction to Mathematical Logic, Vol. I,
Princeton University Press, Princeton, N.J., 1956, pp. 119, 136- 
158. This book contains a wealth of details covering the history 
and alternative approaches to logic, as well as very full exposi- 
tion and exercises. 


Logic 68


A_1, ..., A_m if each formula in the list is one of A_1, ..., A_m
or one of the axioms (of the propositional calculus), or comes
from earlier formulas in the list by the rule of detachment. If
there is such a deduction, we say that B_ℓ is deducible from
A_1, ..., A_m, and write: A_1, ..., A_m |- B_ℓ.


Example: A & B is deducible from A, B; i.e.,
A, B |- A & B. The following is a deduction:

1. A                     1st assumption formula.
2. B                     2nd assumption formula.
3. A -> (B -> A & B)     Axiom Schema 5.
4. B -> A & B            Rule of detachment, 1, 3.
5. A & B                 Rule of detachment, 2, 4.
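
The definitions of proof and deduction lend themselves to a mechanical check. The Python sketch below is an illustration under our own assumptions: formulas are nested tuples, only Axiom Schemata 1, 2 and 5 (the ones used in the two examples of this section) are encoded, and the names `match`, `is_axiom` and `is_deduction` are hypothetical. A proof is treated as a deduction from no assumption formulas.

```python
def imp(a, b):
    return ('imp', a, b)

def conj(a, b):
    return ('and', a, b)

# In a schema, the strings 'A', 'B', 'C' stand for arbitrary formulas.
SCHEMATA = {
    1: imp('A', imp('B', 'A')),
    2: imp(imp('A', 'B'), imp(imp('A', imp('B', 'C')), imp('A', 'C'))),
    5: imp('A', imp('B', conj('A', 'B'))),
}

def match(pattern, formula, binding):
    """Does the formula have the form of the pattern?"""
    if isinstance(pattern, str):
        if pattern in binding:
            return binding[pattern] == formula
        binding[pattern] = formula
        return True
    return (isinstance(formula, tuple)
            and formula[0] == pattern[0]
            and all(match(p, f, binding)
                    for p, f in zip(pattern[1:], formula[1:])))

def is_axiom(formula):
    return any(match(s, formula, {}) for s in SCHEMATA.values())

def is_deduction(lines, assumptions=()):
    """Each line is an assumption formula, an axiom, or comes from
    two earlier lines A and A -> B by the rule of detachment."""
    for k, f in enumerate(lines):
        earlier = lines[:k]
        detached = any(imp(a, f) in earlier for a in earlier)
        if not (f in assumptions or is_axiom(f) or detached):
            return False
    return True

p, q = 'p', 'q'                   # two fixed formulas, taken as prime
proof_of_p_imp_p = [
    imp(p, imp(p, p)),
    imp(imp(p, imp(p, p)), imp(imp(p, imp(imp(p, p), p)), imp(p, p))),
    imp(imp(p, imp(imp(p, p), p)), imp(p, p)),
    imp(p, imp(imp(p, p), p)),
    imp(p, p),
]
deduction_of_conj = [p, q, imp(p, imp(q, conj(p, q))),
                     imp(q, conj(p, q)), conj(p, q)]
```

`is_deduction(proof_of_p_imp_p)` checks the five-line proof of A -> A given earlier in this section, and `is_deduction(deduction_of_conj, assumptions=(p, q))` checks the deduction above.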


Now we show that we have come out with the same results with 
this theory (proof theory) of propositional calculus as with the 
theory of Secs. 7 and 8 (model theory). 

Theorem 10. (The Deduction Theorem, Herbrand, 1930).
For m ≥ 1: If A_1, ..., A_{m-1}, A_m |- B, then
A_1, ..., A_{m-1} |- A_m -> B.

We omit the proof of this theorem, which would take about
two pages. It uses mathematical induction (see pp. 25-26) on the
length ℓ (≥ 1) of the deduction, supposed given, of B from
A_1, ..., A_m.⁷


⁷The proof of this theorem is naturally tied closely to the particular
selection of axiom schemata. Our book: Kleene, S.C.,
Introduction to Metamathematics, D. Van Nostrand Co., Inc., New
York, 1952, pp. 90-91, gives the proof for the axiom schemata we
are using here (apart from ¬ being written instead of ~, and
<-> being written ~ and treated as a symbol of abbreviation).
In Europe: Noordhoff, Groningen, and North Holland Publishing Co.


Logic 69 


Corollary. If A_1, ..., A_m |- B, then
|- A_1 -> (A_2 -> ... (A_m -> B) ...).


The proof consists in applying the theorem m times successively
to A_1, ..., A_m |- B.


Theorem 11. If |- A_1 -> (A_2 -> ... (A_m -> B) ...), then
A_1, ..., A_m |- B.


The method of proof will be illustrated by writing out the 
case for m = 2. Assume that |- A_1 -> (A_2 -> B); consider a
proof of this formula, say of length ℓ. Then the following is a
deduction of B from A_1, A_2:

   1.    .
   .     .                     The given proof of
   .     .                     A_1 -> (A_2 -> B).
   ℓ.    A_1 -> (A_2 -> B)
   ℓ+1.  A_1                   1st assumption formula.
   ℓ+2.  A_2 -> B              Rule of detachment, ℓ+1, ℓ.
   ℓ+3.  A_2                   2nd assumption formula.
   ℓ+4.  B                     Rule of detachment, ℓ+3, ℓ+2.


The Corollary to Theorem 10 and Theorem 11 accomplish the
reduction of the deducibility notion A_1, ..., A_m |- B to the
provability notion |- A in a manner parallel to the reduction of
the notion of valid consequence A_1, ..., A_m |= B to the notion
of validity |= A given in the Corollary to Theorem 6.

Hence, if we can show that |- A and |= A are equivalent,
we shall then have completed the demonstration of the equivalence
of proof theory and model theory for the propositional calculus,
both by itself and when applied under assumptions A_1, ..., A_m.
We do this in Theorems 12 and 13.


Logic 70 


Theorem 12. Every provable formula is valid, in symbols,
if |- A, then |= A.

Proof. By 1-10b of Theorem 4, each axiom of the propo- 
sitional calculus is valid. By Theorem 3, given that the 
premises for an application of the rule of inference are valid, 
so is the conclusion. Thus all formulas in a given proof of A 
are valid; in particular, A itself is valid. 

Corollary. For no formula A are both A and ~A provable,
or in symbols, for no formula A do both |- A and |- ~A
hold. (This corollary establishes the consistency of the propositional
calculus as an axiomatic-deductive system.)

Proof. Suppose |- A and |- ~A for some A. Then, by the
theorem, |= A and |= ~A; i.e., A has all t's in its truth
table, and likewise ~A. This is absurd, since by the table for
~, the table for ~A must have t's and f's reversed from that
for A.

Theorem 13. Every valid formula is provable; or in symbols,
if |= A, then |- A.

We omit the proof, which, though quite elementary, is
rather long to give here, because it must include as a preliminary
the demonstration with |- in place of |= of a number of results
such as we collected in Theorem 4. In this preliminary
part of the proof, Theorem 10 is a helpful tool.⁸


This completes the proof of the equivalence of proof theory 


⁸Kleene, loc. cit.⁷ pp. 132-134, or Chapter I of Hilbert, D.
and Ackermann, W., Principles of Mathematical Logic, Chelsea Publishing
Co., New York, 1950. This book of Hilbert and Ackermann
is one of the best books on logic, being at the same time brief
and clear. (Even better than the English edition cited, which
is a translation of the 2nd German edition, is the 3rd German
edition, Grundzüge der theoretischen Logik, Springer, Berlin,
1949.)


Logic 71


and model theory for the propositional calculus. Natural as the
development of the propositional calculus by truth tables (model
theory) seems now, it was actually the more recent to be fully
exploited, by Emil Post, who first proved Theorems 12 and 13 in
1921, and by Jan Lukasiewicz in 1921 (although some indications
are given by Frege in 1879 and Peirce in 1885). Although an
algebra of logic was initiated by Boole in 1847, the proof theory
of the propositional calculus properly appeared with Frege's
Begriffsschrift in 1879, and in Russell's work, especially in
the Principia Mathematica of Whitehead and Russell in 1910-13.⁹


10. The predicate calculus (model theory): validity


In the propositional calculus, we studied those logical 
relationships which depend on how certain propositions are com- 
posed from other propositions by operations <->, ->, &, V, ~
in which the latter enter as unanalyzed wholes. In the predicate
calculus we carry the analysis a step deeper, to take into account
what in traditional logic is called the subject-predicate structure,
and we use two further operations ∀ ("for all") and
∃ ("for some" or "there exists") which depend on that structure.

Consider the proposition "Socrates is a man." The part of
this proposition or sentence "___ is a man" or "x is a man" is a
predicate; "Socrates" is a subject. Reading "x is a man" using the
mathematical notation of a variable, the predicate is seen to be
a propositional function; i.e., for each value of x, it becomes a
proposition, true for example when x is Socrates, false in
Greek mythology when x is Chiron, and in the Kleene household
when x is Fleck. To take another example, "John loves Jane" is


⁹See Church, loc. cit.⁶ pp. 155 ff.


Logic 72 


a proposition, which can be thought of as a value of any one of
three propositional functions "x loves Jane," "John loves y,"
"x loves y." In grammar, a value of x is a subject, of y an
object; "x loves Jane" is a predicate, but not "John loves y."
Mathematically, these distinctions are not important and we shall
simply adopt the term predicate as short for the more cumbersome
propositional function P(x_1, ..., x_n), for any number n ≥ 0 of
variables, and commonly object for a value of any one of the variables.
For n = 0, we have a proposition as a special case, for
n = 1 a property, for n = 2 a (binary) relation.

This explains the name predicate calculus for the logic of 
propositional functions; a fully descriptive name would be calculus
of propositional functions. The name functional calculus is
also in common use; this seems to us a little unfortunate (though
historical precedence can be claimed for it), as it leaves out the
most descriptive part, that the functions are propositional, and
thus invites confusion with functions from numbers to numbers (commonly
called simply functions when propositional functions are
being called predicates) or even with functionals, i.e., functions
from functions (from numbers to numbers) to numbers.

Just as for the propositional calculus, we supposed given 
by some syntactical rules a non-empty set of prime propositions, we 
can suppose provided a non-empty set of prime predicates (includ- 
ing prime propositions as the case of predicates of zero varia- 
bles). The internal structure of these, or more properly of the 
sentences expressing them, shall be such that it does not get 
mixed up with the symbolism we shall use in constructing formulas 
from them (this we shall state more exactly later). A very easy 


way to secure this is to have a special symbol for each prime 


Logic 73


predicate, e.g., x < y, x = y, etc. The prime predicates we
shall denote by letters P, Q, R, ... with letters u, v, w, x,
y, z, ... for their variables, thus: P(x), P(x,y), Q(x,y,z),
R, etc. Different capital letters P, Q, R, ... from the second
half of the alphabet used together will represent different ones
of the prime predicates (and of course P(x), P(x,y) necessarily
represent different predicates, the first a 1-place and the second
a 2-place predicate). We shall now call the prime predicates simply
predicate letters, since they are expressed simply as P(x), P(x,y),
Q(x,y,z), R, etc., though we do not rule out that these are
abbreviations for more complicated expressions in the presupposed
language.

We now give the definition of the class of propositions
formed from the predicate letters (or of predicates, depending
on how we are using them), whose logic we are to study; these
propositions we shall call "formulas" (for the predicate calculus).
The prime formulas shall be what we can get by any substitution
of variables, not necessarily distinct, for the variables of the
predicate letters. For example, from the predicate letter
P(x,y,z), we have as prime formulas P(x,y,z), P(y,z,x), P(x,x,y),
P(u,u,u), etc. The formulas shall comprise the prime formulas and
the additional formulas (composite formulas) obtainable by repeated
use of seven operators <->, ->, &, V, ~, ∀x, ∃x, thus:
If A and B are formulas, so are A <-> B, A -> B, A & B,
A V B, ~A. If A is a formula, and x is a variable, ∀xA
(read "for all x, A") and ∃xA (read "for some x, A" or "(there)
exists (an) x (such that) A") are formulas. (In the literature
(x), ∧x, Πx may be used for ∀x; (Ex), ∨x, Σx for ∃x.)
∀x is called a universal quantifier, and ∃x an




Logic 74 


existential quantifier. They act as unary operators in building
formulas, and with our other unary operator ~ are ranked weakest
under the conventions for omitting parentheses. For example,
∀xA V B means (∀xA) V B, not ∀x(A V B).

In integral calculus, a definite integral ∫ x^2 y dx is not a
quantity that depends on x, though it does depend on y. Similarly
Σ_{n=0}^∞ x^n/n! does not depend on n, though it does on x. We can
express this by saying that in the first example x is a bound
variable, and y is free; in the second example, n is bound, and x
is free. In 3x + ∫ x^2 y dx, though perhaps the notation is bad
(3x + ∫ t^2 y dt would be better), it is unambiguous, the first
occurrence of x being free and the other two bound.

Similarly, we have bound and free variables, or occurrences
of variables, in the predicate calculus, where the two operators
which bind variables are the quantifiers ∀x and ∃x (rather than
∫ ... dx or Σ_{n=0}^∞). Consider the formula

(1a)  ∀x(P(x) & ∃xQ(x,z) -> ∃yR(x,y)) V Q(z,x).

In the part ∃xQ(x,z), the x is bound by the ∃x, which we can
indicate by putting a subscript 1 to show that these two x's
belong together. Similarly, we can indicate by subscripts 2 and 3
the variable occurrences bound by ∃y and ∀x, respectively. Note
that since the x in Q(x,z) is already bound by ∃x, it is not
free in the part P(x) & ∃xQ(x,z) -> ∃yR(x,y) which the ∀x operates
on, so the ∀x cannot bind it. In supplying the subscripts
we therefore always work from the inside out, following the order
of the steps by which the formula was built up from its prime
components P(x), Q(x,z), R(x,y), Q(z,x). In this manner we obtain

(1b)  ∀x_3(P(x_3) & ∃x_1Q(x_1,z) -> ∃y_2R(x_3,y_2)) V Q(z,x).


Logic 75


The variable occurrences not thus receiving subscripts (two of z
and one of x) are free. As another example, consider

(2a)  ∀y(P(y) & ∃xQ(x,z) -> ∃zR(y,z)) V Q(z,x).

Supplying subscripts, we get

(2b)  ∀y_3(P(y_3) & ∃x_1Q(x_1,z) -> ∃z_2R(y_3,z_2)) V Q(z,x).

Erasing the bound (occurrences of) variables in (1b) and (2b)
gives for both

(1c),(2c)  ∀_3(P(_3) & ∃_1Q(_1,z) -> ∃_2R(_3,_2)) V Q(z,x),

which illustrates that these two formulas are congruent. They
are not congruent to

(3)  ∀z(P(z) & ∃xQ(x,z) -> ∃yR(z,y)) V Q(z,x)

(4)  ∀x(P(x) & ∃xQ(x,z) -> ∃yR(x,y)) V Q(z,y)

as the reader may verify.
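
The verification left to the reader can also be carried out mechanically. In the Python sketch below (our own illustration; the tuple representation and the names `free_vars`, `erase` and `congruent` are assumptions, not notation of these notes), erasing a bound occurrence is modelled by replacing it with the number of quantifiers lying between it and the quantifier that binds it, which plays the part of the subscripts in (1c),(2c).

```python
def pred(letter, *variables):
    return ('pred', letter, variables)

def free_vars(f, bound=frozenset()):
    """The variables having free occurrences in f."""
    op = f[0]
    if op == 'pred':
        return {v for v in f[2] if v not in bound}
    if op in ('forall', 'exists'):
        return free_vars(f[2], bound | {f[1]})
    return set().union(*(free_vars(g, bound) for g in f[1:]))

def erase(f, binders=()):
    """Erase bound occurrences, keeping only which quantifier binds each."""
    op = f[0]
    if op == 'pred':
        # binders lists the enclosing quantified variables, innermost first,
        # so index() finds the quantifier that actually binds v.
        return ('pred', f[1],
                tuple(binders.index(v) if v in binders else v for v in f[2]))
    if op in ('forall', 'exists'):
        return (op, erase(f[2], (f[1],) + binders))
    return (op,) + tuple(erase(g, binders) for g in f[1:])

def congruent(f, g):
    return erase(f) == erase(g)

def example(v, q_args, w, r_args, tail):
    """∀v(P(v) & ∃xQ(q_args) -> ∃wR(r_args)) V Q(tail)."""
    body = ('imp',
            ('and', pred('P', v), ('exists', 'x', pred('Q', *q_args))),
            ('exists', w, pred('R', *r_args)))
    return ('or', ('forall', v, body), pred('Q', *tail))

f1 = example('x', ('x', 'z'), 'y', ('x', 'y'), ('z', 'x'))    # (1a)
f2 = example('y', ('x', 'z'), 'z', ('y', 'z'), ('z', 'x'))    # (2a)
f3 = example('z', ('x', 'z'), 'y', ('z', 'y'), ('z', 'x'))    # (3)
f4 = example('x', ('x', 'z'), 'y', ('x', 'y'), ('z', 'y'))    # (4)
```

With these definitions `congruent(f1, f2)` holds, while (3) fails because its ∀z captures the z of Q(x,z), and (4) fails because its last prime formula has a different free variable.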

For the predicate calculus, as we are treating it, the
variables x, y, z, ... are each to range over the same set D,
called the domain. (It is possible to allow two different domains,
like real numbers for x, y, z, ... and natural numbers for
m, n, p, ...; then one has two-sorted predicate calculus, which we
do not take up. Another form of predicate calculus treats the
predicate letters as variables which may also be quantified; this
gives a second-order predicate calculus, and on iteration higher-
order predicate calculi. To distinguish the simple form of predicate
calculus which we treat from these, it may be called
restricted predicate calculus, or predicate calculus of first
order.)


Logic 76


Consider any predicate letter, for example, P(x,y). The
classical predicate calculus (with which we are concerned) makes
the assumption that, for each pair of values of x, y chosen from
D, the resulting proposition P(x,y) (taken as the value of the
predicate P(x,y)) is either true (t) or false (f), but not both.
However, we are not told which is the case. Considering that this
happens for each x, y in D, it comes to the same thing to say
that there is correlated to the predicate letter P(x,y) a function
ℓ(x,y) from D×D to {t, f}, i.e., a function ℓ(x,y) which
for each x, y in D takes a value in {t, f}. Such a function
ℓ(x,y) we call a logical function. For the case of a
P(x_1, ..., x_n) with n = 0, i.e., simply P, the logical function
ℓ(x_1, ..., x_n) is simply a t or f, as in the propositional
calculus.

The truth tables given for <->, ->, &, V, ~ in the
propositional calculus (p. 49) shall again apply. We also define
now the process of evaluating ∀xA and ∃xA. Occasion to evaluate
these will arise only when we are in a position to evaluate A
for each choice of a member of D as value of x at its free occurrences
in A, or, in brief, when we can evaluate A by a logical
function of x. We define ∀xA to be true (t) if this logical
function has t for all its values, otherwise f; and ∃xA to be
t if this logical function has at least one t among its values,
otherwise f.

Now can we compute a truth table for any formula? To begin
with, D, though supposed fixed, is unknown. Actually, only the
cardinal number D̄ (still unknown) of D matters. For illustration,
however, let us suppose D is a domain of two objects, which we
write simply 1 and 2; i.e., D = {1, 2}. Take the formula


Logic 77 


P(y) V ∀x(P(x) -> Q). To compute a truth value for this, we must
start from an assignment consisting of a logical function of one
variable over D as value of the predicate letter P(x), a truth
value (or logical function of zero variables) as value of Q, and
a member of D as value of the free variable y; i.e., we shall
compute a table to be entered from these three quantities. Before
computing this table, we list the 4 (= 2^2) possible logical
functions of one variable over D, as follows:

   x    ℓ_1(x)   ℓ_2(x)   ℓ_3(x)   ℓ_4(x)
   1      t        t        f        f
   2      t        f        t        f


Here is the table for P(y) V ∀x(P(x) -> Q):

        P(x)     Q    y    P(y) V ∀x(P(x) -> Q)
   1.   ℓ_1(x)   t    1             t
   2.   ℓ_1(x)   t    2             t
   3.   ℓ_1(x)   f    1             t
   4.   ℓ_1(x)   f    2             t
   5.   ℓ_2(x)   t    1             t
   6.   ℓ_2(x)   t    2             t
   7.   ℓ_2(x)   f    1             t
   8.   ℓ_2(x)   f    2             f
   9.   ℓ_3(x)   t    1             t
  10.   ℓ_3(x)   t    2             t
  11.   ℓ_3(x)   f    1             f
  12.   ℓ_3(x)   f    2             t
  13.   ℓ_4(x)   t    1             t
  14.   ℓ_4(x)   t    2             t
  15.   ℓ_4(x)   f    1             t
  16.   ℓ_4(x)   f    2             t


Here is the computation for the entry in Line 8 (explanation
follows):

          P(y) V ∀x(P(x) -> Q)                x    ℓ_2(x) -> f
   (i)    ℓ_2(2) V ∀x(ℓ_2(x) -> f)            1         f
   (ii)   f V f                               2         t
   (iii)  f


The first step is to substitute the assignment represented by 


Logic 78 


Line 8 into the formula to be computed; this gives (i). Before
we can evaluate the part ∀x(ℓ_2(x) -> f), we need to compute
ℓ_2(x) -> f as a logical function of x; the result is shown in
the supplementary table to the right, and here is the computation
of Line 1 of it:

   ℓ_2(x) -> f
   ℓ_2(1) -> f
   t -> f          by table for ℓ_2(x).
   f               by table for -> (p. 49).

Continuing the main computation, since the supplementary table
does not have all t's, ∀x(ℓ_2(x) -> f) is evaluated as f; also
ℓ_2(2) is f by the table for ℓ_2(x); so we get (ii). Finally we get
(iii) by the table for V.

This illustrates the definition of the table for a formula, 
for the given D. As before, shortcuts are possible. For example, 
the observation that A -> B is t whenever B is t shows
that P(x) -> Q will have a supplementary table of all t's
whenever Q is t, and hence by our prescription for evaluating
∀xA and finally the table for V, we can write t in Lines 1, 2,
5, 6, 9, 10, 13, 14 without further ado. 

It should be clear from this illustration that, provided a 
domain D has been selected and this domain is finite, a table can 
be computed for any given formula. Of course for large finite D, 
if no shortcuts are used, the computation may be of impractical 
length. If D is infinite, the table is no longer a finite ob- 
ject which we can actually compute; but what is meant by the table 
should be clear enough, and we may indeed be able to reason about 
it. 
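
The table computation just described can be mechanized for a small finite domain. The following is only a sketch in Python, not part of the original notes; the names `functions`, `value`, and `table` are introduced here for illustration, and t and f are represented by `True` and `False`. It recomputes the 16 lines for P(y) ∨ ∀x(P(x) → Q) over D = {1,2}, in the same order as the table above.

```python
from itertools import product

D = [1, 2]

# All logical functions of one variable over D, as dicts x -> bool.
# product() yields them in the order l1(x), l2(x), l3(x), l4(x) of the text.
functions = [dict(zip(D, vals)) for vals in product([True, False], repeat=len(D))]

def value(p, q, y):
    """Truth value of P(y) v Ax(P(x) -> Q) for one assignment (p, q, y)."""
    # The supplementary table for P(x) -> Q must have all t's for Ax to be t.
    forall = all((not p[x]) or q for x in D)
    return p[y] or forall

# One entry per line: each function, with Q = t, f, and y = 1, 2.
table = [value(p, q, y)
         for p in functions
         for q in (True, False)
         for y in D]

for number, v in enumerate(table, start=1):
    print(number, "t" if v else "f")
```

For a domain of n members the table has 2ⁿ · 2 · n lines, so this brute-force approach is practical only for quite small n.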

When can a formula A be said to be true on the basis of only
the predicate calculus? Considering that both the D (or D̄), and
the logical functions over D as values for the predicate letters
in A (or truth values in the case of zero variables), and the
members of D as values for the free variables of A, will all be
unavailable, the answer must be as follows: The formula A is true
on the basis of the predicate calculus, exactly if, for each
choice of D, the resulting truth table has only t's in its value
column. In this case we say A is valid (in the predicate calculus)
and write ⊨ A. (In the present sections on the predicate calculus,
it will be understood that "valid" and "⊨" refer to the
predicate calculus, unless the contrary is stated.)

It is also often of interest to consider the predicate calculus
supplemented by a choice of D or of D̄; we then say A is
valid in the domain D, or write D̄-⊨ A, to say that the truth
table of A for the chosen D has all t's. (Interesting cases are
D̄ = k, a positive integer, and D̄ = ℵ₀.)

There is a vast difference now from the situation we had in 
the propositional calculus. There each question as to the valid- 
ity of a formula A could be settled automatically by computing the 
truth table. Now the definition of validity refers to a whole 
family of truth tables, one for each D, and for an infinite D we 
cannot really compute the table. For validity, every one of these 
tables should give all t's. Despite this difficulty, we shall see 
that logical theory has gone quite far in solving problems of the 
predicate calculus. 

For a demonstration of non-validity, it suffices to find
just one D and one line of the table for this D which gives f.
Thus indeed we already know that P(y) ∨ ∀x(P(x) → Q) is not
valid, because we found an f in Line 8 of its table for D̄ = 2.

We shall now show the formula P(y) → ∃xP(x) to be valid.
In doing so, we cannot help using some general reasoning, as we
must show that we get all t's in the table for any D. However, to
help ourselves picture the situation, we begin by taking
D = {1,2,3}. The 1-place logical functions are now 8 (= 2³) in
number, as follows:


   x   ℓ₁(x)  ℓ₂(x)  ℓ₃(x)  ℓ₄(x)  ℓ₅(x)  ℓ₆(x)  ℓ₇(x)  ℓ₈(x)
   1     t      t      t      t      f      f      f      f
   2     t      t      f      f      t      t      f      f
   3     t      f      t      f      t      f      t      f


The table for P(y) → ∃xP(x) will have 24 (= 8·3) lines,
since all 8 logical functions have to be listed under P(x),
each with each of the 3 members of D as y. We show two lines
as a sample.

          P(x)     y    P(y) → ∃xP(x)
   17.    ℓ₆(x)    2          t
   22.    ℓ₈(x)    1          t


For Line 17, note that ℓ₆(x) has a t in its table, namely, for
x = 2. Hence by the rule for evaluating ∃x, the part ∃xP(x)
takes the value t, and so by the table for → the whole is t.
This consideration suffices for the first 21 lines of the table,
in all of which the ℓ(x) has a t in its table; i.e., it suffices
for every ℓ(x) except ℓ₈(x). For Line 22, on the other hand,
ℓ₈(1) is f, so by the table for →, the whole is t. This
consideration suffices for the last 3 lines, in which, since
ℓ₈(x) has only f's in its table, ℓ₈(y) and thus P(y) will be f
whatever y is. It should be clear now that for any D, even an
infinite one, the table for P(y) → ∃xP(x) will have all t's. The
demonstration is given by classifying the assignments to P(x), y
(or the "lines"). First, consider any assignment with an ℓ(x) as
value for P(x) other than the logical function with all f's;
then ∃xP(x) is t, so the whole is t. Second, consider any line
with the ℓ(x) whose table is all f's as value for P(x); then
P(y) is f whatever the value assigned to y, so the whole is t.
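
As a check on the case classification, one can compute the table for several small domains. The sketch below is not from the original notes; the function name `valid_in_domain` is an assumption made for illustration. It confirms all t's for D̄ = 1, ..., 5. Such a computation establishes validity only in the finite domains tried; the general argument above is still what shows validity for every D.

```python
from itertools import product

def valid_in_domain(size):
    """Check P(y) -> ExP(x) on every line of the table for D = {1, ..., size}."""
    D = range(1, size + 1)
    for vals in product([True, False], repeat=size):
        p = dict(zip(D, vals))           # one logical function over D
        exists = any(p[x] for x in D)    # value of ExP(x)
        for y in D:
            if p[y] and not exists:      # the only way "->" can give f
                return False
    return True

results = {size: valid_in_domain(size) for size in range(1, 6)}
print(results)
```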

More examples of valid formulas will be provided in
Theorem 18. However, first let us see what we can take over from
our study of the propositional calculus (Sec. 7).

There we reasoned that, if a truth table entered from
values for parts not necessarily prime gives all t's, the truth
table entered from values of the prime parts must likewise give
all t's. This principle we stated as Theorem 1, p. 53. The same
reasoning holds good now. For example, any formula of the form
A(y) → ∃xA(x) (to give the idea roughly) will be valid. For in
computing any line of its table, for a given D, the computation
will consist of two parts: first, the computation of a logical
function as value of A(x), and second the computation thence of a
t or f as value of the whole. The second part of the computation
will coincide with the computation of some line of the table
for P(y) → ∃xP(x); and, as we have seen, all lines of the table
for P(y) → ∃xP(x) have t. This gives the idea, and one is not
likely to go astray in applying it. To state it accurately as a
substitution rule, a little care is necessary to spell out fully
what we mean by "a formula of the form A(y) → ∃xA(x)" or, which
comes to the same thing, "a formula coming by a proper
substitution from P(y) → ∃xP(x)."

To do so, first let us describe the mechanics of substitution
for a predicate letter. For this purpose, let us think of
the substitution in the example as being for P(w). For P(w) we
are to substitute a formula, call it A(w). By A(r), for any
variable r, we shall mean the result of substituting r for the free
occurrences of w in A(w). By the result of substituting A(w) for
the predicate letter P(w) in P(y) → ∃xP(x), we shall mean the
result of replacing the parts P(y), P(x) by A(y), A(x)
respectively (y, x are successively the r). To illustrate, we show
it for four choices of A(w).

                      I                          II
   A(w):              ∀zQ(w,z)                   ∀yQ(w,y)
   A(y) → ∃xA(x):     ∀zQ(y,z) → ∃x∀zQ(x,z)      ∀yQ(y,y) → ∃x∀yQ(x,y)

                      III                        IV
   A(w):              Q(w,u)                     Q(w,x)
   A(y) → ∃xA(x):     Q(y,u) → ∃xQ(x,u)          Q(y,x) → ∃xQ(x,x)


Of these four substitutions, we shall regard only I and III (left
column) as "proper"; in II and IV the computation of a line for
A(y) → ∃xA(x) will not always split into two parts so that the
second part coincides with one of the computations for
P(y) → ∃xP(x). The trouble in II is that the free y in P(y)
becomes bound by the ∀y of the A(w), and in IV that the free x in
the A(w) becomes bound by the ∃x in the ∃xP(x). These mixups in
the way the variables are bound after the substitution will be
avoided by observing two restrictions.
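
That II and IV really can go wrong may be seen by table computation for D = {1,2}. The following sketch is not from the original notes; the function names `sub_I`, ..., `sub_IV` are introduced here only for illustration. It runs through all 16 logical functions of two variables over D as values of Q and reports, for each of the four results of substitution, whether every line of its table is t.

```python
from itertools import product

D = [1, 2]
pairs = list(product(D, D))
# Every logical function of two variables over D, as a dict (a, b) -> bool.
relations = [dict(zip(pairs, vals))
             for vals in product([True, False], repeat=len(pairs))]

def sub_I(q, y):        # AzQ(y,z) -> ExAzQ(x,z)
    return (not all(q[y, z] for z in D)) or any(all(q[x, z] for z in D) for x in D)

def sub_II(q):          # AyQ(y,y) -> ExAyQ(x,y)
    return (not all(q[y, y] for y in D)) or any(all(q[x, y] for y in D) for x in D)

def sub_III(q, y, u):   # Q(y,u) -> ExQ(x,u)
    return (not q[y, u]) or any(q[x, u] for x in D)

def sub_IV(q, y, x):    # Q(y,x) -> ExQ(x,x); the x inside Ex is a separate bound x
    return (not q[y, x]) or any(q[b, b] for b in D)

all_t = {
    "I": all(sub_I(q, y) for q in relations for y in D),
    "II": all(sub_II(q) for q in relations),
    "III": all(sub_III(q, y, u) for q in relations for y in D for u in D),
    "IV": all(sub_IV(q, y, x) for q in relations for y in D for x in D),
}
print(all_t)
```

In II an f arises, for instance, when Q is taken as equality; in IV, when Q is taken as inequality with y = 1 and the free x = 2.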

To formulate the substitution process and these restrictions
in general, say the substitution is of formulas
Aᵢ(w₁, ..., w_{pᵢ}) (i = 1, ..., n) simultaneously for the
predicate letters Pᵢ(w₁, ..., w_{pᵢ}) in a formula E not
containing any of w₁, ..., w_{pₙ}. The substitution with result
E* is effected by replacing simultaneously each part of E of the
form Pᵢ(r₁, ..., r_{pᵢ}) by Aᵢ(r₁, ..., r_{pᵢ}), where
Aᵢ(r₁, ..., r_{pᵢ}) is the result of substituting simultaneously
r₁, ..., r_{pᵢ} for the free occurrences of w₁, ..., w_{pᵢ}
respectively in Aᵢ(w₁, ..., w_{pᵢ}) (the variables
r₁, ..., r_{pᵢ} are not necessarily distinct). The substitution is
proper if (A) none of the variables of E occurs bound in any
Aᵢ(w₁, ..., w_{pᵢ}) and (B) none of the free variables of any
Aᵢ(w₁, ..., w_{pᵢ}) occurs bound in E.
1 Py 


Theorem 14. (Substitution Rule for the Predicate
Calculus = Theorem 1 extended to the Predicate Calculus.) Let E
be a formula containing as predicate letters only
Pᵢ(w₁, ..., w_{pᵢ}) (i = 1, ..., n). Let E* come from E by a
proper substitution of formulas Aᵢ(w₁, ..., w_{pᵢ})
simultaneously for Pᵢ(w₁, ..., w_{pᵢ}). If ⊨ E, then ⊨ E*.


Theorem 2 and Corollary, and Theorem 3 (pp. 55-56) hold for
the predicate calculus by the same reasoning as for the
propositional calculus.

As an especially simple application of Theorem 14 (with all
pᵢ = 0 and no variables in E), any formula E valid in the
propositional calculus will be valid in the predicate calculus when
its prime components P₁, ..., Pₙ are, or have substituted for them,
formulas in the present sense. In particular, Theorem 4 (p. 57)
continues to hold. (Theorem 5 will be replaced by Theorem 19
below.)

Two formulas which are congruent (p. 75) will have the same
tables, for any given D. For the alphabetical differences in the
bound variables will make no difference in the process of
computing the tables. Hence, by Theorem 2 (as extended to the
predicate calculus):

Theorem 15. If A and B are congruent, ⊨ A ↔ B.

In our definition of proper substitution, we made the
restrictions as simple as we could. As a result, they are stronger
than they need to be. Now if from ⊨ E we have concluded that
⊨ E* by Theorem 14, we can continue by Theorem 15 and the
corollary to Theorem 2 to obtain ⊨ E′ for any formula E′ congruent
to E*. In a fuller treatment of the predicate calculus, this
possibility could be included directly in Theorem 14 by a more
complicated definition of proper substitution. In the following
theorem, 41 illustrates what this procedure allows by substitution
in the valid formula P(y) → ∃xP(x), and 42 is obtained by
demonstrating similarly that ∀xP(x) → P(y) is valid.


Theorem 16. Suppose x is any variable, and A(x) is any
formula such that, when the variable r (not necessarily distinct
from x) is substituted for the free occurrences of x in A(x)
with result A(r), none of the resulting occurrences of r is
bound. Then:

   41. ⊨ A(r) → ∃xA(x).        42. ⊨ ∀xA(x) → A(r).

Corollary. If ⊨ ∀xA(x), then ⊨ A(x).

Proof. Take x as the "r" in 42, and use also Theorem 3
(as extended to the predicate calculus).

Theorem 17. Let x be any variable, C be any formula not
containing any free occurrence of x, and A(x) be any formula.
Then:

   (a) If ⊨ C → A(x), then ⊨ C → ∀xA(x).

   (b) If ⊨ A(x) → C, then ⊨ ∃xA(x) → C.

Proof. (a) Suppose ⊨ C → A(x). We must show that
⊨ C → ∀xA(x). Choose any D. For this D, consider any assignment
of logical functions and members of D to exactly the predicate
letters and free variables of C → ∀xA(x) (i.e., consider any
line of the table to be entered for exactly these); call this the
given assignment. (Since x does not occur free in C, this does
not include an assignment of a member of D as value to x.) Case I:
For the given assignment, C is f. Then, by the table for →,
C → ∀xA(x) is t. Case II: For the given assignment, C is t.
Then, for the given assignment, supplemented by any assignment to
x, C is still t, and, since C → A(x) is t (by the hypothesis
⊨ C → A(x)), A(x) is t. As this was for any assignment to x
together with the given assignment, ∀xA(x) is t for the given
assignment by the rule for evaluating ∀. Hence, by the table for
→, C → ∀xA(x) is t for the given assignment. (b) By a
similar analysis by cases, according to whether C is t or f
for a given assignment.

Corollary. If ⊨ A(x), then ⊨ ∀xA(x).

Proof. Suppose ⊨ A(x). By 1 of Theorem 4 (for the predicate
calculus), ⊨ A(x) → (P ∨ ~P → A(x)). Thence by Theorem 3,
⊨ P ∨ ~P → A(x). Taking P ∨ ~P as the C in (a) (it contains no
variable), ⊨ P ∨ ~P → ∀xA(x). By 17 of Theorem 4, ⊨ P ∨ ~P. By
Theorem 3, ⊨ ∀xA(x).

Theorem 18. Let x and y be any distinct variables,
A(x), B(x), A(x,y) be any formulas, and A any formula not
containing any free occurrence of x. Then:

   43. ⊨ ∃x∃yA(x,y) ↔ ∃y∃xA(x,y).      44. ⊨ ∀x∀yA(x,y) ↔ ∀y∀xA(x,y).

   45. ⊨ ∃x∀yA(x,y) → ∀y∃xA(x,y).

   46. ⊨ ~∃xA(x) ↔ ∀x~A(x).            47. ⊨ ~∀xA(x) ↔ ∃x~A(x).

   48. ⊨ ∃xA(x) ↔ ~∀x~A(x).            49. ⊨ ∀xA(x) ↔ ~∃x~A(x).

   50. ⊨ ∃xA(x) ∨ ∃xB(x) ↔             51. ⊨ ∀xA(x) & ∀xB(x) ↔
         ∃x(A(x) ∨ B(x)).                    ∀x(A(x) & B(x)).

   52. ⊨ A ∨ ∃xB(x) ↔                  53. ⊨ A & ∀xB(x) ↔
         ∃x(A ∨ B(x)).                       ∀x(A & B(x)).

   54. ⊨ A ∨ ∀xB(x) ↔                  55. ⊨ A & ∃xB(x) ↔
         ∀x(A ∨ B(x)).                       ∃x(A & B(x)).

   56. ⊨ ∀xA(x) ∨ ∀xB(x) →             57. ⊨ ∃x(A(x) & B(x)) →
         ∀x(A(x) ∨ B(x)).                    ∃xA(x) & ∃xB(x).


Unlike the case of the propositional calculus, these cannot
be established simply by computation. By the substitution rule
(Theorem 14), it will suffice to establish them with predicate
letters in place of the general formulas A(x), B(x), A(x,y), A;
e.g., 46 will follow if ~∃xP(x) ↔ ∀x~P(x) can be shown valid.
The reader should not find it too difficult to convince himself
of the validity by using appropriate case classifications similar
to those used for P(y) → ∃xP(x).

As an exercise, which can be carried out simply by table
computation for D̄ = 2, show that the converses of the formulas
of 45, 56, 57 are not valid in the case of particular predicate
letters; e.g., for 45, ∀y∃xP(x,y) → ∃x∀yP(x,y) is not valid.
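
For the converse of 45 the exercise can be carried out by machine. The sketch below is not from the original notes; `formula_45`, `converse_45`, and `counterexamples` are names introduced here for illustration. It lists the logical functions of two variables over D = {1,2} which, as values of P(x,y), give f to ∀y∃xP(x,y) → ∃x∀yP(x,y).

```python
from itertools import product

D = [1, 2]
pairs = list(product(D, D))
relations = [dict(zip(pairs, vals)) for vals in product([True, False], repeat=4)]

def formula_45(p):      # ExAyP(x,y) -> AyExP(x,y)
    lhs = any(all(p[x, y] for y in D) for x in D)
    rhs = all(any(p[x, y] for x in D) for y in D)
    return (not lhs) or rhs

def converse_45(p):     # AyExP(x,y) -> ExAyP(x,y)
    lhs = all(any(p[x, y] for x in D) for y in D)
    rhs = any(all(p[x, y] for y in D) for x in D)
    return (not lhs) or rhs

counterexamples = [p for p in relations if not converse_45(p)]
for p in counterexamples:
    print(sorted(pair for pair, v in p.items() if v))
```

The two functions found are those true exactly on {(1,1), (2,2)} and exactly on {(1,2), (2,1)}; formula 45 itself gets t on all 16 lines.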

Theorem 19. Let E be a formula composed from
K₁, ..., Kₖ, ~K₁, ..., ~Kₖ, where K₁, ..., Kₖ are prime
formulas, by using only &, ∨, ∀, ∃; and let E† be the result of
interchanging & with ∨, ∀ with ∃, and each Kⱼ not preceded by ~
with ~Kⱼ, in E. Then ⊨ ~E ↔ E†.

This is proved like Theorem 5 (p. 58), but now using also
46 and 47. For example, by this theorem,

   ⊨ ~∀x[(P(x) ∨ ∃y~Q(x,y)) & ∀yR(y)] ↔
          ∃x[(~P(x) & ∀yQ(x,y)) ∨ ∃y~R(y)].

Compare pp. 58-59 for a discussion of "evaluation" of the
negation of general formulas which is also applicable in the
predicate calculus.


Theorem 20. To each formula E, there is a formula F with
all of its quantifiers at the front, such that ⊨ E ↔ F.

Say, for example, E is the first of the following formulas.

   ∃xP(x) → ∀xQ(x)
   ~∃xP(x) ∨ ∀xQ(x)             36
   ∀x~P(x) ∨ ∀xQ(x)             46
   ∀x(∀x~P(x) ∨ Q(x))           54
   ∀x(Q(x) ∨ ∀x~P(x))           22
   ∀x(Q(x) ∨ ∀y~P(y))           Theorem 15
   ∀x∀y(Q(x) ∨ ~P(y))           54

A replacement based on the result cited at the right transforms
each formula into the next; F is the last. The demonstration
that ⊨ E ↔ F can be given by corresponding applications of
the corollary of Theorem 2, starting with ⊨ E ↔ E (cf. 19,
Theorem 4); or directly by the idea used to infer the corollary
from Theorem 2.
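
The equivalence of the first and last formulas of this example can be spot-checked by table computation in small domains. The following sketch is not from the original notes; `agree_on_domain` is a name introduced here for illustration. It compares ∃xP(x) → ∀xQ(x) with ∀x∀y(Q(x) ∨ ~P(y)) for every pair of logical functions over domains of 1, 2, and 3 members. Agreement in these domains does not by itself prove ⊨ E ↔ F, which rests on the replacements cited.

```python
from itertools import product

def agree_on_domain(size):
    """Compare E and F on every assignment to P, Q over a domain of given size."""
    D = range(size)
    functions = list(product([True, False], repeat=size))
    for p, q in product(functions, repeat=2):
        e = (not any(p)) or all(q)                           # ExP(x) -> AxQ(x)
        f = all(q[x] or not p[y] for x in D for y in D)      # AxAy(Q(x) v ~P(y))
        if e != f:
            return False
    return True

agreement = [agree_on_domain(size) for size in (1, 2, 3)]
print(agreement)
```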

11. The predicate calculus (model theory): valid consequence.

In the propositional calculus, we have thought of each
prime formula as standing for some particular proposition, the
same throughout all formulas of the propositional calculus. But
what proposition it is, and thus whether it is true (t) or false
(f), is unknown or not to be taken into account in the
propositional calculus; so for a formula A to be true only by
propositional calculus, i.e., to be valid, must mean that A has t
in every line of its truth table, and for B to be a valid
consequence of A must mean that B has t in all lines for which A
has t.

In the predicate calculus, the like is the case for the
predicate letters, except that we use logical functions instead
of simply truth values. The corresponding interpretation of a
free variable would be that it stands for a particular member of
D, the same throughout the predicate calculus, though what member
we are not told in the predicate calculus. However, in
applications of the predicate calculus, when the details of the
underlying language and its interpretation are supplied, the
members of D that the free variables represent will still be
unspecified, and need not be the same in different formulas.

For example, as an application of the predicate calculus,
we might be told that P(x,y) is x = y, i.e., P is equality.
It will be so in every occurrence. The two formulas P(x,x) and
P(x,y) → P(y,x) would thus become x = x and x = y → y = x;
but here the x and y are still unspecified, and the x does
not need to stand for the same member of D in an application of
x = x as in an application of x = y → y = x (though the two
x's within either formula must always stand simultaneously for the
same member of D). In this case, x = x and x = y → y = x by
themselves as axioms or theorems (not as parts of other formulas)
express that whatever member of D x is, x equals x, and
independently whatever two members of D x and y are, if x
equals y then y equals x. They are then synonymous with
∀x(x = x) and ∀x∀y(x = y → y = x), respectively.

The definition of validity is unaffected by this
reinterpretation, as is illustrated for a formula A(x) with one free
variable x by the corollaries of Theorems 16 and 17. But it affects
the notion of valid consequence, because several formulas
A₁, ..., Aₘ, B are involved, so there is the question whether a
free variable in these formulas must represent the same member of
D simultaneously in all the formulas, or can represent any member
of D in each independently of the others.

The story is really a familiar one from the difference
between an identical equation and a conditional equation in algebra.
As examples of identical equations we have (1) x + y = y + x
and (2) (x+1)² = x² + 2x + 1; as examples of conditional
equations, (3) y = x + 1 and (4) x² - 2x - 3 = 0. From (1) we
have a right to infer 3 + 1 = 1 + 3 or (x+z) + 2x = 2x + (x+z).
From (4) we have no right to infer that 2² - 2·2 - 3 = 0 or
y² - 2y - 3 = 0, though we can infer (x-3)(x+1) = 0 and thence
x = 3 ∨ x = -1. We say in (1) and (2) the variables have the
generality interpretation, in (3) and (4) the conditional
interpretation. In these examples, the symbolism includes somewhat
more (e.g., =, +, -, 0, 1, 2, 3) than we have in the predicate
calculus, but they illustrate the principle.
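
The contrast can be made concrete numerically. The sketch below is not from the original notes; the range of test values is an arbitrary assumption. It checks that (1) and (2) hold for every value tried, while (4) holds only for particular values of x. A finite check of this kind merely illustrates the identities; it does not prove them.

```python
values = range(-10, 11)

# Identical equations (1) and (2): true for every value substituted.
identity_1 = all(x + y == y + x for x in values for y in values)
identity_2 = all((x + 1) ** 2 == x ** 2 + 2 * x + 1 for x in values)

# Conditional equation (4): true only for the x that solve it.
solutions_4 = [x for x in values if x ** 2 - 2 * x - 3 == 0]

print(identity_1, identity_2, solutions_4)
```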

Under the generality interpretation of x, from the
assumption A(x) we are justified in inferring whatever follows from
A(x) being true for all x; under the conditional interpretation,
any conclusions we draw containing x should refer to the same
x as in the assumption (though we won't for the predicate
calculus know what one).¹⁰

That the generality interpretation is, so to speak, the
standard one is natural, since, e.g. in arithmetic, generally
x + y = y + x will have a permanent status as an axiom or theorem
while x² - 2x - 3 = 0 would be only a temporary assumption, as
in connection with some particular problem which leads to this
equation.

A formula containing no free variables we call closed. For
any formula A let its free variables in order of first free
occurrence be x₁, ..., xₙ. By the closure of A we mean the formula
∀x₁...∀xₙA, which of course is closed; and we write it briefly ∀A.
(If A is closed, i.e., if n = 0, ∀A is simply A.)

In view of the fact that the generality interpretation is
to apply to the free variables in assumption formulas (unless
otherwise stated), we say that B is a valid consequence of
A₁, ..., Aₘ (in the predicate calculus) and write A₁, ..., Aₘ ⊨ B, if
for each domain D, B has the value t for each assignment for
which ∀A₁, ..., ∀Aₘ simultaneously have the value t. (If
A₁, ..., Aₘ are closed, then the ∀'s are redundant here.)

Now we have, by the same reasoning as Theorem 6 and its
Corollary:

Theorem 21. (a) A ⊨ B, if and only if ⊨ ∀A → B (and if
and only if ∀A ⊨ B).

(b) A₁, ..., Aₘ ⊨ B if and only if
∀A₁ & ... & ∀Aₘ ⊨ B, hence if and only if ⊨ ∀A₁ & ... & ∀Aₘ → B,
and indeed (cf. 51) if and only if ⊨ ∀(A₁ & ... & Aₘ) → B.


¹⁰Further discussion (though in terms of proof theory rather
than model theory) is in Kleene loc. cit.,⁷ pp. 148-151,
especially beginning bottom p. 149.


Corollary. A₁, ..., Aₘ₋₁, Aₘ ⊨ B if and only if
⊨ ∀A₁ → (∀A₂ → ... (∀Aₘ → B)...).


Theorems 7, 8, and 9 hold by slight modification of the
former proofs, and Theorem 17 can be strengthened to the
following:

Theorem 22. Under the circumstances of Theorem 17 (note
x must not occur free in C):

   (a) C → A(x) ⊨ C → ∀xA(x).
   (b) A(x) → C ⊨ ∃xA(x) → C.


Suppose we do wish to draw conclusions from assumptions in
which variables have the conditional interpretation, as we do in
"solving" the equation x² - 2x - 3 = 0, or when we say "Let x
be a number such that x > 0. Then ... ." All we need to do in
this case is to modify both the definition of valid consequence
and Theorem 21 and its Corollary by exempting the variables in
question from the closure operation. If there are several
assumption formulas, we lose nothing essential by arranging that
the variables having the generality interpretation be different
from those having the conditional interpretation; let y₁, ..., y_p
have the generality interpretation, x₁, ..., x_q the conditional
interpretation, and let ∀′ denote closure with respect to
y₁, ..., y_p only. We say B is a valid consequence of
A₁, ..., Aₘ holding x₁, ..., x_q constant, or
A₁, ..., Aₘ ⊨_{x₁...x_q} B, if, for each D, B is t for each
assignment which makes ∀′A₁, ..., ∀′Aₘ simultaneously t. In fact,
this modified notion of valid consequence is very useful in
studying the predicate calculus.


To contrast the two notions, take for example an A
which contains only one free variable x. The statement A ⊨ₓ B
is stronger than A ⊨ B; i.e., if A ⊨ₓ B then A ⊨ B, but not
in general conversely. Proof directly from the definition in
terms of truth tables, or thus: I. Suppose A ⊨ₓ B. By Theorem
21 (modified), ⊨ A → B. But by Theorem 9, A, A → B ⊨ B; so by
Theorem 7, A ⊨ B. II. By 11 of Theorem 4, ⊨ ∀xP(x) → ∀xP(x),
so by Theorem 21, P(x) ⊨ ∀xP(x). But if we had P(x) ⊨ₓ ∀xP(x),
we would get by Theorem 21 (modified) ⊨ P(x) → ∀xP(x), which is
not the case, as can be shown by computing the truth table for
D̄ = 2. This example of course shows the necessity of the ∀ in
Theorem 21, and also the necessity of excluding free x's from the
C for Theorems 17(a) and 22(a) (since ⊨ P(x) → P(x)).
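
The truth table for D̄ = 2 mentioned in part II is small enough to compute directly. The following sketch is not from the original notes; the names `lines`, `failures`, and `closure_ok` are introduced here for illustration. It finds the lines on which P(x) → ∀xP(x) is f, and also confirms that ∀xP(x) → ∀xP(x) is t on every line.

```python
from itertools import product

D = [1, 2]
functions = [dict(zip(D, vals)) for vals in product([True, False], repeat=len(D))]

# P(x) -> AxP(x): one line for each logical function p and each value of x.
lines = [((not p[x]) or all(p.values()), p, x) for p in functions for x in D]
failures = [(p, x) for value, p, x in lines if not value]

# AxP(x) -> AxP(x): the closure of the assumption implies the conclusion.
closure_ok = all((not all(p.values())) or all(p.values()) for p in functions)

print(len(failures), closure_ok)
```

The f's occur exactly where P(x) is t for the value assigned to x but not for the other member of D.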

Continuing with the case where A contains only x free,
A ⊨ B if and only if ⊨ ∀xA → B, and thence if and only if
⊨ ∀xA → ∀xB; but A ⊨ₓ B if and only if ⊨ ∀x(A → B). Proof.
I. The first part is Theorem 21. Then to transform ⊨ ∀xA → B
into ⊨ ∀xA → ∀xB, use the corollaries to Theorems 16 and 17 to
get ⊨ ∀x(∀xA → B), and then apply 36, 54, and 36 (with Corollary
Theorem 9). II. First use the modification of Theorem 21 for ⊨ₓ,
and then apply the corollaries to Theorems 16 and 17.

To contrast further the two notions, consider the case
where A still contains only x free but B does not contain x
free. However let us rewrite A and B as "A(x)" and "C" to match
the notation in Theorems 17 and 22. Now A(x) ⊨ C if and only if
⊨ ∀xA(x) → C; but A(x) ⊨ₓ C if and only if ⊨ ∃xA(x) → C.
Proof. To Theorem 21 for ⊨ and ⊨ₓ, this adds only that ⊨ A(x) → C
is equivalent to ⊨ ∃xA(x) → C when C does not contain x free.
This can be shown by either of the following methods. Method 1:
First use the corollaries to Theorems 16 and 17, and then apply
36, 22, 54, 46, 22, 36. Method 2: From ⊨ A(x) → C we infer
⊨ ∃xA(x) → C by Theorem 17(b). To proceed in the opposite
direction, suppose that ⊨ ∃xA(x) → C. By 41, ⊨ A(x) → ∃xA(x).
These with 12 and Theorem 3 give ⊨ A(x) → C.

12. The predicate calculus (proof theory): provability and
deducibility.


As axioms for the predicate calculus, we add to those of
the propositional calculus (given by Axiom Schemata 1-10b, p. 57)
the formulas having the forms appearing in 41 and 42 (Theorem 16,
p. 84), i.e., we add those forms as two new axiom schemata.

As rules of inference, we have the rule of detachment
(or →-rule) (p. 65) of the propositional calculus, and two new
rules: the ∀-rule allows us to pass from C → A(x) to
C → ∀xA(x), and the ∃-rule allows us to pass from A(x) → C to
∃xA(x) → C, when (in each case) C does not contain x free
(cf. Theorem 17, p. 84, and Theorem 22, p. 91).

The definitions of proof (of A) (pp. 65-66), of A is provable
or ⊢ A (p. 66), of deduction (of B) from A₁, ..., Aₘ (pp. 67-68),
and of B is deducible from A₁, ..., Aₘ or A₁, ..., Aₘ ⊢ B
(p. 68) are as before, except of course that the set of axioms is
enlarged by the two new axiom schemata, and the inferences can be
drawn by either the →-rule (a two-premise rule) or the ∀-rule
or ∃-rule (each a one-premise rule).

For example, for any formula A, the following is a deduction
of A from ∀x∀yA:

   1.  ∀x∀yA            1st assumption formula.
   2.  ∀x∀yA → ∀yA      Axiom Schema 42, with x, x, ∀yA as the
                        "x", "r", "A(x)".
   3.  ∀yA              →-rule, 1, 2.
   4.  ∀yA → A          Axiom Schema 42, with y, y, A as the
                        "x", "r", "A(x)".
   5.  A                →-rule, 3, 4.

Thus ∀x∀yA ⊢ A.

It follows immediately from the form of the definition of
A₁, ..., Aₘ ⊢ B that Theorem 7 (p. 62) holds with "⊨" replaced by
"⊢". (This remark could appropriately have been made back in
Sec. 9 where "deducible" was first defined.)

To define deduction (of B) from A₁, ..., Aₘ, holding
x₁, ..., x_q constant, and B is deducible from A₁, ..., Aₘ holding
x₁, ..., x_q constant or A₁, ..., Aₘ ⊢_{x₁...x_q} B, the only change
is to exclude x₁, ..., x_q from playing the rôle of the x in
applications of the ∀-rule or the ∃-rule, except applications which
precede all occurrences of A₁, ..., Aₘ in the deduction. However,
x₁, ..., x_q may play the rôle of the "r" for applications of Axiom
Schemata 41 and 42.

Theorem 23 (a). (The Deduction Theorem, Herbrand, 1930).
If A₁, ..., Aₘ₋₁, Aₘ ⊢ B with all free variables of Aₘ held
constant, then A₁, ..., Aₘ₋₁ ⊢ Aₘ → B.

(b) If A₁, ..., Aₘ₋₁, Aₘ ⊢ B, then
A₁, ..., Aₘ₋₁ ⊢ ∀Aₘ → B. Similarly, using ∀′, when some of the
free variables of Aₘ are held constant, ∀′ indicating closure with
respect to the rest.


We omit the proof of (a), as we did of the deduction theorem
for the propositional calculus (Theorem 10, p. 68); the
proof¹¹ would require adding a page or two to the proof cited for
Theorem 10.

Given (a), (b) follows thus. Suppose A₁, ..., Aₘ₋₁, Aₘ ⊢ B.
But ∀Aₘ ⊢ Aₘ, by repeated use of Axiom Schema 42 with the
→-rule, as illustrated above, where we showed that ∀x∀yA ⊢ A. So
by Theorem 7 for ⊢, A₁, ..., Aₘ₋₁, ∀Aₘ ⊢ B, whence by (a),
A₁, ..., Aₘ₋₁ ⊢ ∀Aₘ → B.

Corollary. If A₁, ..., Aₘ ⊢ B, then
⊢ ∀A₁ → (∀A₂ → ... (∀Aₘ → B)...). Similarly, using ∀′, when
some of the free variables of A₁, ..., Aₘ are held constant.

Theorem 24. If ⊢ ∀A₁ → (∀A₂ → ... (∀Aₘ → B)...), then
A₁, ..., Aₘ ⊢ B. Similarly, using ∀′, when some of the free
variables of A₁, ..., Aₘ are held constant.

Proof. For example, suppose ⊢ ∀xA₁ → (∀x∀yA₂ → B). Let K be any
closed axiom. Then the following is a deduction of B from A₁, A₂.

   1.
   2.
   ...                                  The given proof of
   ℓ.      ∀xA₁ → (∀x∀yA₂ → B)          ∀xA₁ → (∀x∀yA₂ → B).
   ℓ+1.    A₁                           1st assumption formula.
   ℓ+2.    A₁ → (K → A₁)                Axiom Schema 1.
   ℓ+3.    K → A₁                       →-rule, ℓ+1, ℓ+2.
   ℓ+4.    K → ∀xA₁                     ∀-rule, ℓ+3.
   ℓ+5.    K                            An axiom.
   ℓ+6.    ∀xA₁                         →-rule, ℓ+5, ℓ+4.
   ℓ+7.    ∀x∀yA₂ → B                   →-rule, ℓ+6, ℓ.
   ℓ+8.    A₂                           2nd assumption formula.
   ℓ+9.    A₂ → (K → A₂)                (similar to ℓ+2, ℓ+3)
   ℓ+10.   K → A₂
   ℓ+11.   K → ∀yA₂                     ∀-rule, ℓ+10.
   ℓ+12.   K → ∀x∀yA₂                   ∀-rule, ℓ+11.
   ℓ+13.   ∀x∀yA₂                       →-rule, ℓ+5, ℓ+12.
   ℓ+14.   B                            →-rule, ℓ+13, ℓ+7.

Thus A₁, A₂ ⊢ B. If x is the only free variable of A₁ and x, y
are the only ones of A₂, this illustrates the first part of the
theorem. If some variables other than x, y occur free in A₁ and A₂
(besides only x in A₁ and x, y in A₂), then since, outside the
steps 1-ℓ of the proof, the ∀-rule is applied only to x and y
(at Steps ℓ+4, ℓ+11, ℓ+12) and the ∃-rule not at all, we have
A₁, A₂ ⊢ B, holding those other variables constant, which
illustrates the second part of the theorem.

¹¹Kleene, loc. cit.,⁷ pp. 97-98.

We have now reduced the notion of deducibility
(A₁, ..., Aₘ ⊢ B) to that of provability (⊢ A) in a manner parallel
to the reduction in Sec. 11 of A₁, ..., Aₘ ⊨ B to ⊨ A. So now to
show the equivalence of model theory and proof theory for the
predicate calculus, it remains to show that ⊨ A if and only
if ⊢ A.


Theorem 25. If ⊢ A, then ⊨ A.

Proof, by the same method as Theorem 12 (p. 70), remembering
that Theorems 3 and 4 (particularly 1-10b) still hold good,
and using also Theorems 16 and 17.

Corollary. For no formula A, do both ⊢ A and ⊢ ~A hold.
(Consistency of the predicate calculus.)

The corollary follows from the theorem in the same way as
does the corollary to Theorem 12. However, the present corollary
does not require the full force of the theorem, which refers to
any D, but it can be inferred from the Theorem with ⊨ replaced
by 1-⊨, i.e., using D = {a}. This was the way the consistency
of the predicate calculus was first proved, in the 1st edition
(1928) of Hilbert and Ackermann.

Our remaining objective appears as Corollary 1 to Theorem
26, which indeed also implies its theorem. We state the theorem
in the form that is most convenient for the proof. We say a
formula A is satisfiable in the domain D, if, for the domain D,
the truth table of A has at least one t, i.e., if A is t for
some assignment of logical functions over D to its predicate
letters and of members of D to its free variables.

Theorem 26. (Gödel's completeness theorem for the predicate
calculus, 1930). For each formula A in the predicate calculus,
either ⊢ ~A, or A is satisfiable in the domain D of the natural
numbers, i.e., for D = {0,1,2,...}.

The proof¹² of this theorem is not as elementary as the
proofs of the deduction theorems (Theorems 10 and 23) and of the
completeness of the propositional calculus (Theorem 13, p. 70),
which we have also omitted.

Corollary 1. If ⊨ A for D = {0,1,2,...}, then ⊢ A. So if
⊨ A, then ⊢ A.

Proof. Assume that ⊨ A for D = {0,1,2,...}, or briefly
ℵ₀-⊨ A. Then, for this D, A has a table of all t's, so ~A
has a table of all f's, so ~A is not satisfiable. Thus by the
theorem (with ~A as its "A") ⊢ ~~A, so by Axiom Schema 4 and the
→-rule, ⊢ A.

Theorem 26 is not the most famous theorem of Gödel (the most
famous is "Gödel's Theorem" of 1931 which we shall meet in Chapter
III), but a little consideration will show that it is a remarkable
theorem, especially if we include it with its Corollary 2 (below)
which had been obtained earlier by another method.

First consider our net result (by Corollary 1 with Theorem
25) that model theory and proof theory are equivalent for the
predicate calculus.

¹²See Hilbert and Ackermann, loc. cit.,® pp. 95-101, or Kleene,
loc. cit.,⁷ pp. 389 ff.


In model theory, to show that a formula A is valid by
direct application of the definition of validity, we must resort to
general reasoning. It is not enough simply to compute a truth
table, since a table for every D must be taken into account, and
since moreover the "table" cannot really be computed for an
infinite D. In fact, when D is infinite, the set of the logical
functions over D is uncountable, e.g., when D̄ = ℵ₀ the cardinal
number of the set of the logical functions over D is 2^ℵ₀.

In proof theory, on the other hand, to show that a formula 
is provable by a direct application of the definition of prova- 
bility means constructing a proof of it. Once the proof is con- 
eepacted, to judge whether or not it really is a proof is an 
automatic process: we simply test each formula in turn to see if 
it is an axiom by one of our fifteen axiom schemata or if it is 
an immediate consequence of a preceding formula or formulas by 
one of our three rules of inference. In fact, if a formula is 
provable, a proof of it can, in principle, be found automatically 
in a finite number of steps. For the proofs in the predicate cal- 
culus (being finite sequences of formulas) can be enumerated, and 
thus, by trying each proof in order according to some enumeration, 
we shall eventually reach a proof of the formula in question (as- 
suming it to be provable). The notion of provability thus re- 
quires consideration of only a countably infinite set, namely, the 
set of proofs, in contrast to the uncountable infinities which 
enter into the notion of validity. 
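
The search procedure just described can be sketched in a few lines. The following is only an illustration under stated assumptions: the "formal system" is a hypothetical toy (strings over the letters a and b, one axiom, two one-premise rules) and is not the predicate calculus; the names `is_axiom`, `follows`, `is_proof`, `candidates` and `find_proof` are invented for the sketch. What it shows is the shape of the argument: candidate proofs are finite strings of symbols, so they can be listed in order, and checking any one candidate is a purely mechanical test.

```python
from itertools import count, product

# Hypothetical toy system (an assumption for illustration only).
# Formulas are non-empty strings over "ab"; ';' separates the formulas
# of a candidate proof.
ALPHABET = "ab;"

def is_axiom(f):
    # The single axiom of the toy system.
    return f == "a"

def follows(f, earlier):
    # Toy rules of inference: from s infer s + "b"; from s infer "a" + s.
    return any(f == s + "b" or f == "a" + s for s in earlier)

def is_proof(seq):
    # Mechanical check: each formula is an axiom or an immediate
    # consequence of a preceding formula.
    return bool(seq) and all(
        is_axiom(f) or follows(f, seq[:i]) for i, f in enumerate(seq)
    )

def candidates():
    # Enumerate every finite string over ALPHABET, shortest first,
    # and read each one as a sequence of formulas.
    for n in count(1):
        for symbols in product(ALPHABET, repeat=n):
            yield "".join(symbols).split(";")

def find_proof(target):
    # Returns a proof if the target is provable; otherwise never returns.
    for seq in candidates():
        if seq[-1] == target and is_proof(seq):
            return seq

print(find_proof("aab"))
```

Note that `find_proof` halts only when the target is provable. For an unprovable target it runs forever, which is exactly the gap between "a proof can be found if one exists" and "provability can be decided".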

To say that a proof can be found for any formula that is
provable is not the same thing as saying that whether any formula
is provable or not can be decided automatically. (This is a
question we shall return to.) But this situation is in contrast to
the situation in model theory, where demonstrations of
non-validity, for some examples of formulas we encountered, could be
given automatically by table construction (can they always?),
while demonstrations of validity required some general reasoning
which it was up to us to discover.

So it is surprising that validity and provability, the 
former a highly transcendental notion and the latter relatively 
elementary, are equivalent. For the (classical) propositional 
calculus, it is not easy to say that proof theory has any great 
advantage over model theory, but in the predicate calculus there 
is a very great gain. We have established an axiomatic-deductive 
system, by applying which every formula which is in fact valid 
can be rigorously proved, if one accepts the proofs of Theorem 25 
and Corollary 1 of Theorem 26. Put another way, only the proof 
of Theorem 25 has to be accepted for demonstrations of the valid- 
ity of formulas; thereafter the entire procedure is an automatic 
verification that a proof obeys the rules of the axiomatic- 
deductive system. Then Corollary 1 to Theorem 26 assures us that 
by relying on these rules no valid formula will escape us; that 
is, all the valid formulas will be found by the process of con-
structing proofs. Thus direct proofs of the validity of the
formulas in 435-57 (p. 86) avel Sedinedek: aniline Corollary 1 
of Theorem 26 we can be sure that, having eaitaundoten giketion 4 
(for the predicate calculus, from Theorem 4 for the propositional 
calculus via a simple application of Theorem 14) and Theorems 16 
and 17, demonstrations of the validity of 43-57 can be given by 
what is in effect mere repetition of the proofs of Theorems3,4 , 16, 


and 17. (In a fuller treatment of the subject, we would give
these proofs rather than ask the reader to confirm 45-57 directly
from the definition of validity, for which we could give him no
exact instructions.)

Thus Gödel's completeness theorem gives us one thing for
which we were seeking (the second sentence of Corollary 1). But 
in fact it gives us more than we sought (the first sentence). 
This is emphasized in a second corollary. 


Corollary 2. (a) If a formula A is valid in the one domain
$\{0, 1, 2, \ldots\}$, then A is valid (briefly: if $\aleph_0$-$\vDash A$,
then $\vDash A$).

(b) (Löwenheim's Theorem, 1915; also called the Löwenheim-Skolem
Theorem.) If a formula is satisfiable in some non-empty domain D,
it is satisfiable in the domain $\{0, 1, 2, \ldots\}$.

Proof. (a) Suppose $\vDash A$ in $\{0, 1, 2, \ldots\}$. Then by Corollary 1,
$\vdash A$, whence by Theorem 25, $\vDash A$. (b) Suppose A is satisfiable in D.
Then not $\vDash \sim A$ (since for this D, A has some t in its table, so
$\sim A$ has some f), whence by Theorem 25 (contraposed) not $\vdash \sim A$,
whence by Theorem 26, A is satisfiable in $\{0, 1, 2, \ldots\}$.

It may not seem remarkable at first that if a formula is
satisfiable in any domain D, it is also satisfiable when D is the
natural numbers. But we illustrate the force of this result by an
application of the predicate calculus to axiomatic set theory,
where one expects the domain D, which includes all the sets for
the theory, to be uncountable. We choose the axioms for set
theory used by Gödel in 1940, with minor rearrangements in notation.
(We do not use the axioms of Fraenkel's 1953 book which we cited
on page 45, because we do not know how Fraenkel proposes to put
these axioms into a language for the predicate calculus. The
remarks to be made apply also to the axiomatic set theory of von
Neumann 1925, and of Bernays in publications beginning in 1937.)
Gödel's axioms are expressible in the symbolism of the predicate
calculus, using just three prime predicates, $x \in y$, $x = y$, and
$M(x)$ (read "x is a set"), besides which five axioms for the
equality predicate $x = y$, which Gödel regarded as part of the
presupposed logic, should be added to his list. After taking
closures of those of the axioms not already expressed as closed
formulas, we have 21 closed formulas $A_1, \ldots, A_{21}$ which are
supposed to describe a domain of objects including all sets for
Gödel's axiomatic set theory. Let A be $A_1 \& \ldots \& A_{21}$.
Assuming that there actually is a system of objects D that satisfy
these axioms of set theory, A is satisfied in this domain D with
"$x \in y$," "$x = y$," and "$M(x)$" for the prime predicates. Hence,
by Löwenheim's theorem (Corollary 2(b) of Theorem 26), there is a
countable domain D' in which A is also satisfied.

We have no assurance that the logical function assigned to
$x = y$ in thus satisfying the axioms in D' makes $x = y$ true only
when the same member of D' is assigned to x and y, i.e., that
it gives = its usual meaning. (It will make $x = y$ true when the
same member of D' is assigned to x and y, since $\forall x\,(x = x)$ is one
of the five axioms for equality.) However, this can be arranged
by the further step of identifying with each element x of D' all
other elements y of D' for which $x = y$ is true; in this way we
are led to a domain D'', finite or countably infinite, in which
(as can be seen by considering the form of the five equality
axioms¹³) all twenty-one axioms will be satisfied, with = now in
its usual meaning (identity). Inspection of the sixteen axioms
for sets rules out the case that D'' is finite. To summarize,
there is thus a countably infinite domain D'' in which A is
satisfied, with = in its usual meaning. Yet in the axiomatic set
theory described by A, Cantor's theorem, according to which the
set of the subsets of the natural numbers (which is a set) is
uncountable, holds. This is Skolem's "paradox" (1922).

It is not a paradox in the sense of an outright contradiction,
but a kind of anomaly. For there is an explanation, namely,
that the "enumerating" set of ordered pairs, which constitutes the
1-1 correspondence between the natural numbers and the elements of
the countable D', is not itself a set admitted in the axiomatic set
theory. This of course means that any axiomatization must fail
fully to capture the notions of set, set of subsets of a given set,
1-1 correspondence, and countable. These concepts must, if we
give them a prior status, elude characterization by any set of a
finite number of axioms in the symbolism of the (restricted)
predicate calculus. But on the other hand, the paradoxes of set theory
make it hard to give them a prior status independent of axiom
systems. This led Skolem to the view that the concepts of set theory
have only a relative status, so that, e.g., a set which is
uncountable in one axiomatization is countable in another.

Of course, another explanation of Skolem's "paradox" would
be that there is no system of objects D which satisfies the axioms
of set theory. In this case, by Gödel's completeness theorem, $\sim A$
must be provable in the predicate calculus, so that the system of
axiomatic set theory with the predicate calculus as the means of
deduction would actually be inconsistent; that is, in this case
there would have to be a "real" paradox within axiomatic set
theory, though it hasn't been found yet.

¹³Cf. Kleene, loc. cit.,⁷ pp. 400-401.

In Skolem's "paradox" as applied above to Gödel's, von
Neumann's, and Bernays's axiomatizations of set theory, we used the
fact that in these systems the number of axioms (i.e., of extra- 
logical axioms) is finite. In Fraenkel's 1922 system of axiomatic 
set theory, and in Skolem's 1922 system, there are infinitely many 
axioms. Another example of an axiomatic theory expressed in the 
symbolism of the predicate calculus with infinitely many axioms is 
the arithmetic of the natural numbers (although it was only in 
1952 that it was proved by Ryll-Nardzewski in Poland that a finite 
number of axioms would not suffice). 

In our treatment of valid consequence ($A_1, \ldots, A_n \vDash B$) and
deducibility ($A_1, \ldots, A_n \vdash B$), we had only a finite number of
assumption formulas, but the definitions go through for infinitely
many with no other change than to replace the $A_1, \ldots, A_n$ by
$A_0, A_1, A_2, \ldots$. The infinitely many formulas are countably
infinite of necessity, since the underlying language is supposed to
have a finite or at most countably infinite alphabet (cf. Sec. 3
for the appropriate discussion of countability). Now
"$A_0, A_1, A_2, \ldots \vdash B$" is then clearly equivalent to "for some
m, $A_0, \ldots, A_m \vdash B$," since in a given deduction, which is a finite
sequence of formulas, of B from $A_0, A_1, A_2, \ldots$, only a finite number
of A's can be used; say that m is the greatest index of those used.
It is not immediately clear that there is a similar reduction of
"$A_0, A_1, A_2, \ldots \vDash B$" to "for some m, $A_0, \ldots, A_m \vDash B$."
However, let us see what the results are.




We say formulas $A_0, A_1, A_2, \ldots$ are simultaneously
satisfiable in a domain D, if there is an assignment of logical
functions over D to the predicate letters, and of members of D to the
free variables, which makes all of $A_0, A_1, A_2, \ldots$
simultaneously t.
Theorem 27. (Gödel's completeness theorem for infinitely
many formulas, 1930.) Given a countably infinite class
$A_0, A_1, A_2, \ldots$ of formulas of the predicate calculus, either, for
some m, $\vdash \sim(A_0 \& A_1 \& \ldots \& A_m)$, or all of $A_0, A_1, A_2, \ldots$ are
simultaneously satisfiable in $\{0, 1, 2, \ldots\}$.


The proof requires only a slight modification of the proof
of Theorem 26.¹⁴


Corollary 1. If $A_0, A_1, A_2, \ldots \vDash B$ in the domain
$\{0, 1, 2, \ldots\}$, then $A_0, A_1, A_2, \ldots \vdash B$. Hence if
$A_0, A_1, A_2, \ldots \vDash B$, then $A_0, A_1, A_2, \ldots \vdash B$.

Proof. Say each of $A_0, A_1, A_2, \ldots$ is closed. Suppose
$A_0, A_1, A_2, \ldots \vDash B$ in $D = \{0, 1, 2, \ldots\}$. Then for each
assignment in D for which $A_0, A_1, A_2, \ldots$ are all t, B is t, which
comes to the same thing as saying that for each assignment in D
either one of $A_0, A_1, A_2, \ldots$ is f or B is t, i.e., that for
each assignment in D, one of $\sim B, A_0, A_1, A_2, \ldots$ is f, i.e.,
$\sim B, A_0, A_1, A_2, \ldots$ are not simultaneously satisfiable in D.
So by the theorem, for some m, $\vdash \sim(\sim B \& A_0 \& \ldots \& A_m)$, which
transforms (and here we take as proven that $\vdash$ and $\vDash$ are
equivalent as applied to finitely many formulas) by 23, 37 (p. 57) to
$\vdash A_0 \& \ldots \& A_m \rightarrow B$, and thence to $A_0, \ldots, A_m \vdash B$ (similarly
to Theorem QY, or by Theorem 94 with 13, p. 57). So
$A_0, A_1, A_2, \ldots \vdash B$. Similarly when $A_0, A_1, A_2, \ldots$ are not
necessarily all closed, by taking closures (or if some variables
are to be held constant, closures with respect to the other
variables).

¹⁴Kleene, loc. cit.,⁷ p. 397.


Corollary 2. (a) If $A_0, A_1, A_2, \ldots \vDash B$ in $\{0, 1, 2, \ldots\}$,
then $A_0, A_1, A_2, \ldots \vDash B$.

(b) (Löwenheim's Theorem as extended by Skolem 1920 to infinitely
many formulas.) If $A_0, A_1, A_2, \ldots$ are simultaneously satisfiable
in some non-empty domain, then they are simultaneously satisfiable
in the domain $\{0, 1, 2, \ldots\}$.


Proof. (a) Assume $A_0, A_1, A_2, \ldots \vDash B$ in $\{0, 1, 2, \ldots\}$. Then by
Corollary 1, $A_0, A_1, A_2, \ldots \vdash B$, whence similarly to Theorem 25,
$A_0, A_1, A_2, \ldots \vDash B$.

(b) Suppose that $A_0, A_1, A_2, \ldots$ are simultaneously
satisfiable in D. Then for every m not $\vDash \sim(A_0 \& \ldots \& A_m)$
(since for this D, some line of the table has t for all of
$A_0, \ldots, A_m$, and hence f for $\sim(A_0 \& \ldots \& A_m)$), whence by
Theorem 25 (contraposed), not $\vdash \sim(A_0 \& \ldots \& A_m)$, whence by
Theorem 27, $A_0, A_1, A_2, \ldots$ are simultaneously satisfiable in
$\{0, 1, 2, \ldots\}$.

Corollary 2(b) shows that Skolem's "paradox" applies 
equally to systems of axiomatic set theory using a countable in- 
finity of axioms, like Fraenkel's system of 1922 or the Skolem 
system of 1922. 

Since Gödel's original proof in 1930 of the completeness
theorem (Theorems 26 and 27) and modifications of it by Hilbert
and Ackermann 1938 and Hilbert and Bernays 1939, there have
appeared proofs by Mostowski 1948, Henkin 1949, Rasiowa and
Sikorski 1951, Rieger 1951, Abraham Robinson 1951, Beth 1951,
Łoś 1955, Reichbach 1955, and others. Some of these recent
proofs apply topology or algebra. Applications of logic to
algebra via Gödel's completeness theorem have been studied at length
by Robinson 1951 and Tarski 1952.¹⁵

Returning to the problems of the predicate calculus itself,
we have seen, by Corollary 2(a) to Theorem 26, that if a formula
is not valid, its "table" for $D = \{0, 1, 2, \ldots\}$ would reveal its
non-validity. In each of the examples of non-valid formulas which
we happened to notice in Sec. 10, the non-validity could be shown
by truth-table computation for $D = \{1, 2\}$. We now give an example
of a formula whose non-validity can only be shown by the use of an
infinite D. That is, this formula will be valid for every finite
$|D| > 0$, but non-valid for $|D| = \aleph_0$. Thus, quite contrary to what
might have been expected in Sec. 10, we do not have any automatic
method to show non-validity, though we now do to show validity.

¹⁵Robinson, Abraham, On the metamathematics of algebra, Studies
in logic and the foundations of mathematics, North Holland
Publishing Co., Amsterdam, 1951; Tarski, Alfred, Arithmetical
classes and types of algebraic systems, Colloquium Lectures
delivered at the 33rd Colloquium of the American Mathematical
Society, 1952, to be published.

An elementary book on logic which can be recommended is Tarski,
A., Introduction to Logic, Oxford University Press, New York, 1941.

To make the idea of the example clear, we start with three
propositions satisfied by the order relation < for the natural
numbers:

(1)   $x < y \;\&\; y < z \rightarrow x < z$,   $\sim x < x$,   $\exists y\,(x < y)$,

but as is easily seen incapable of being satisfied by any choice
of the order relation in a system of a finite number > 0 of
elements. Now we rewrite these propositions taking closures and the
conjunction, and using P(x,y) instead of x < y:

(2)   $\forall x \forall y \forall z\,(P(x,y) \;\&\; P(y,z) \rightarrow P(x,z)) \;\&\; \forall x\,\sim P(x,x) \;\&\; \forall x \exists y\,P(x,y)$.

Call this formula A, and consider $\sim A$. The properties mentioned
for (1) are tantamount to saying that A is satisfiable for
$|D| = \aleph_0$ but not for any finite $|D| > 0$. It thence follows that $\sim A$
is not valid for $|D| = \aleph_0$, but is valid for each finite $|D| > 0$.
This example is due to Hilbert and Bernays 1934.
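
The finite half of this claim can be checked by "table construction" in the literal sense. The sketch below is an illustration only (the function names are invented here): for a domain of n elements it tries every possible two-place relation P and tests the three conjuncts of (2). It cannot, of course, confirm satisfiability in the infinite domain; that part rests on the order relation < of the natural numbers.

```python
from itertools import product

def satisfies_A(n, P):
    # P is a set of ordered pairs over the domain {0, ..., n-1}.
    D = range(n)
    transitive = all(
        (x, z) in P
        for x, y, z in product(D, repeat=3)
        if (x, y) in P and (y, z) in P
    )
    irreflexive = all((x, x) not in P for x in D)
    serial = all(any((x, y) in P for y in D) for x in D)
    return transitive and irreflexive and serial

def satisfiable_in_size(n):
    # Try all 2**(n*n) relations on an n-element domain.
    pairs = list(product(range(n), repeat=2))
    for bits in product([False, True], repeat=len(pairs)):
        P = {p for p, b in zip(pairs, bits) if b}
        if satisfies_A(n, P):
            return True
    return False

for n in range(1, 4):
    print(n, satisfiable_in_size(n))
```

The search space grows as 2 to the power n squared, so the exhaustive check is practical only for very small n. The general argument is short: the third conjunct yields an endless chain of P-successors, a finite domain forces the chain to repeat an element, and transitivity then gives P(x,x), contradicting the second conjunct.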

Although many features of the predicate calculus go back to
Frege 1879, the first explicit formulation of it as an
axiomatic-deductive system (proof theory) was by Hilbert and Ackermann
1928.¹⁶

¹⁶See Church, loc. cit.,⁶ pp. 288 ff.


Chapter III 


MATHEMATICAL FOUNDATIONS 


13. Axiomatic thinking vs. intuitive thinking in mathematics 


It is true that both sets and logic can be considered as 
part of the "foundations" of mathematics, in the sense that most 
mathematical theories make basic use of both. In this chapter, 
however, we wish to go further, to inquire into the nature of 
mathematics and the scope of mathematical methods. 

The axiomatic-deductive method in mathematics is known to
us from Euclid's Elements, although there is a tradition that
credits Pythagoras with the introduction of the method. By use of
it, the body of geometrical knowledge was systematized. Euclid's
axiomatic system may be described roughly thus: "definitions" of
certain primitive terms, such as point, line, plane are given,
which are intended to suggest to the reader what is meant by those
terms; certain propositions concerning the primitive terms, felt
to be acceptable as immediately true on the basis of the
properties suggested by the definitions, are taken as axioms or
postulates; then other terms are defined in terms of the primitive
ones, and other propositions, called theorems, are deduced by
logic from the axioms.

Axiomatics such as Euclid's, in which meaning is given to
the primitive terms from the outset, is called material
axiomatics. The discovery of non-Euclidean geometry by Lobatchevsky
1829 and Bolyai 1832, however, showed that the meanings of the
primitive terms in terms of physical space do not enable one to
decide whether the famous "Parallel Postulate" of Euclid, or a
postulate contradictory to it, is true; the differences in the
resulting geometries may be too small to show up in any
measurements we can make in the portion of space accessible to us. So
for a proposition in Euclidean geometry to be exactly true must
be a property of the geometry as a logical system. Now if
Euclidean geometry is a valid logical structure, so is the
Lobatchevskian geometry; for as Felix Klein pointed out in 1871, the axioms
of the plane Lobatchevskian geometry are all true when the
primitive terms in them are reinterpreted so that "plane" is taken to
mean the interior of a circle in the Euclidean plane, "point"
means a point inside this circle, "line" means a chord of this
circle, and distance and angles are computed by formulas due to
Cayley 1859. (Another such Euclidean model, applicable to a
bounded portion of the non-Euclidean plane, was given in 1868 by
Beltrami, who reinterpreted line segments as segments of geodesics
on a surface of constant negative curvature.)

In these models observe that something new has been done 
with the axioms, not to be found in the earlier axiomatic think- 
ing: the meanings of the primitive terms have been varied, hold- 
ing the deductive structure of the theory fixed. Thus formal 
axiomatics arose, in which the meanings of the primitive terms, 
instead of being specified in advance, are left unspecified for 
the deductions of the theorems from the axioms. One is then free 
to choose the meanings of the primitive terms in any way that 
makes the axioms true. This proves especially fruitful in such 
cases as abstract group theory, where the results deduced from
the axioms with the set of elements and multiplication operation
unspecified constitute a body of theory ready-made for diverse
applications which result by different choices of the set and
multiplication operation, such as groups of transformations,
groups of automorphisms, etc.




The system may then be investigated for such properties as
the independence of one axiom from the others (by seeking an
interpretation of the primitive terms which makes that axiom false
and the others true), categoricity (i.e., that any two
interpretations can be put into 1-1 correspondence preserving all
properties), etc.¹⁷

In this approach to axiomatics the questions arise: Why
do we choose the axioms we do, and why should the resulting
systems be of interest to us? The answer is suggested above by the
remark that we may apply the resulting theory to systems of
objects provided from outside the axioms by an interpretation of the
primitive terms, and indeed in some cases many different
interpretations are possible (the axioms may then be called ambiguous) as
in the case of the axioms for abstract groups. Indeed we should
not wish to employ a system of axioms satisfied under no
interpretation; such a system we call vacuous. One of the problems in
formal axiomatics is to show axiom systems non-vacuous. However,
a system of objects used as an interpretation is often drawn from
some other axiomatic theory; then we have a regress, which merely
brings us to the question of the significance of that axiomatic
theory instead. If at no stage is an application made outside of
formal axiomatics, the whole activity must appear to be futile.
We therefore conclude that if we are not to adopt a mathematical
nihilism, we must grant that formally axiomatized mathematics must
not be the whole of mathematics. At some place there must be
meaning, truth and falsity. At the very least, when we say that
in a given axiomatic theory a certain proposition is a theorem, we
must believe this is true, i.e., that it does follow from the
axioms, though whether the proposition itself is true is being
left out of account since for formal axiomatics we are taking the
primitive terms as unspecified or meaningless.

¹⁷This kind of treatment of axiomatic systems is well presented in
Young, J.W., Lectures on fundamental concepts of algebra and
geometry, Macmillan, New York, 1911.

As a further illustration of a mathematical proposition
which is not intended to be asserted merely as a formal but
meaningless consequence of axioms, consider the statement that, given
integers a, b, c, we can find out whether or not integers x and
y exist such that ax + by + c = 0, i.e., the proposition that
there is a method of determining whether or not ax + by + c = 0
(a, b, c integers) is solvable in integers. Although the theory
of the integers may have been established axiomatically, this 
proof is intended to mean that we could discover whether or not 
there are solutions. A student who could merely reproduce the 
deduction from the axioms of the proposition that one can find out 
whether there are solutions or not, but could not do a problem in 
which he found out, would not have acquired what the teacher in- 
tended to teach. Nevertheless, he would be doing all that should 
be asked of him if the theorem (that one can find out) were in- 
tended only in the sense of formal axiomatics. 
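
To make concrete what "finding out" means here, the following sketch gives one such method. The notes do not say which method is intended; the code assumes the standard one, namely that ax + by + c = 0 has an integer solution exactly when the greatest common divisor of a and b divides c (with the degenerate case a = b = 0 solvable only when c = 0), and it uses the extended Euclidean algorithm to exhibit a solution. The function names are invented for the illustration.

```python
from math import gcd

def solvable(a, b, c):
    # Decide whether a*x + b*y + c == 0 has a solution in integers.
    g = gcd(a, b)
    if g == 0:              # a == b == 0
        return c == 0
    return c % g == 0

def extended_gcd(a, b):
    # Return (g, s, t) with g = gcd(a, b) >= 0 and a*s + b*t == g.
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    if old_r < 0:
        old_r, old_s, old_t = -old_r, -old_s, -old_t
    return old_r, old_s, old_t

def solve(a, b, c):
    # Return one integer solution (x, y), or None if there is none.
    g, s, t = extended_gcd(a, b)
    if g == 0:
        return (0, 0) if c == 0 else None
    if c % g != 0:
        return None
    k = (-c) // g           # exact, since g divides c
    return (s * k, t * k)

print(solvable(6, 10, 8), solve(6, 10, 8))
print(solvable(6, 10, 7), solve(6, 10, 7))
```

A student who can run this procedure on given a, b, c has what the teacher intended; one who can only reproduce a formal derivation that such a procedure exists has not.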

As the least drastic method of meeting the situation posed 
by the paradoxes, we described axiomatic set theory in Sec. 6. 
Here the axiomatics is to be understood in the formal sense, un- 
less one is to try to retain an intuitive conception of sets, 
which it was presumed was exactly what the axioms were to sup- 
plant. However the present considerations show that the resort 
to a formal axiomatic theory, though it may offer considerable 
advantages, leaves open such problems as why the axioms are
significant and whether or not they apply to any system of objects
not merely similarly postulated as existing for some other
axiomatic theory.

Hilbert, whose Foundations of Geometry, 1899, was a classic
of the formal axiomatic method, undertook to deal with these 
problems. He admitted that classical mathematics contains much 
that goes beyond what is clearly meaningful and justifiable on 
intuitive grounds, as indeed mathematicians generally were made 
to realize when in set theory they went too far and encountered 
paradoxes. But he proposed to save classical mathematics (short 
of the paradoxes) by a program which we can roughly describe as 
follows. Classical mathematics should be formulated as a formal 
axiomatic theory, and then the theory should be shown to be con- 
sistent, i.e., free from contradiction. 

Before this proposal of Hilbert, first made in 1904, but 
not seriously undertaken by him and his co-workers until after 
1920, consistency proofs had been given for formal axiomatic the- 
ories by means of a model or interpretation, in which the axioms 
are found to be all true when the primitive terms are interpreted 
in terms of another theory. We saw an example of this above, by 
which the non-Euclidean plane geometry of Lobatchevsky is shown 
to be consistent if Euclidean geometry is consistent. In each 
case, a proof of consistency by a model only shows one theory 
consistent if another is. By Descartes's method of analytic 
geometry, the consistency of geometries generally is reduced to 
that of the theory of real numbers, i.e., analysis. But how is 
one to establish the consistency of analysis? Certainly not by 
using a geometrical model; this would be a vicious circle. Nor, 
according to Hilbert and Bernays, by appeal to the physical
world. For limitations of our measurements in the physical world
prevent us from saying that a continuum is actually given by
experience; rather it is an idea we obtain by extrapolating or
idealizing what is actually given.¹⁸

So Hilbert's proposal to prove classical mathematics as em- 
bodied in a formal system consistent called for a new method in 
place of the method of giving a model. This method consists in a 
direct application of the idea of consistency, namely, that there 
be no contradiction or paradox consisting of two theorems, one of 
which is the negation of the other. Hilbert proposed to show that 
this cannot happen, by making the proofs in the axiomatic theory 
the object of a mathematical investigation, called metamathematics.
Of course, such a demonstration of consistency is in a sense
relative to the methods used in the metamathematics. Hilbert
therefore aimed to use in the metamathematics only methods, which we
call "finitary," that are intuitively convincing. Specifically,
these methods should avoid use of the "actual infinite." Hilbert's
new approach avoids the completed infinite in the statement of
the problem of proving consistency, since the proofs in a theory
are only countably infinite, while what the theory is supposed to
be about may be much less elementary. So it seemed not implausi- 
ble to hope that it might be solved by finitary methods. 

We shall take a closer look at how classical mathematics is 
to be made into a formal axiomatic theory and studied in meta- 
mathematics in Sec. 14. Meanwhile let us consider further the 
import of the proposal for such a consistency proof. 

¹⁸Hilbert, D. and Bernays, P., Grundlagen der Mathematik, Vol.
I, Springer, Berlin, 1934, pp. 15-17. See also Kleene, loc. cit.,
pp. 54-55.

Brouwer, who was the champion of intuitive thinking in
mathematics as Hilbert was of axiomatic thinking, argued that even
if Hilbert should succeed in giving a consistency proof for
classical mathematics, that would not make classical mathematics
correct. Thus he wrote in 1923, "An incorrect theory which is not
stopped by contradiction is none the less incorrect, just as a
criminal policy unchecked by a reprimanding court is none the less
criminal." To which Hilbert replied in 1928, "To take the law of
the excluded middle away from the mathematician would be like
denying the astronomer the telescope or the boxer the use of his
fists." This controversy between the "formalists," represented
by Hilbert, and the "intuitionists," represented by Brouwer, led
eventually to the agreement by the intuitionists that Hilbert's
program would be unobjectionable if and only if the formalists
refrain from taking a consistency proof as justification for
attaching a real meaning to those parts of mathematics which the
intuitionists reject as having no intuitive basis.

The problem for formalist mathematics then becomes: how,
after admitting that classical mathematics goes beyond intuitive
evidence, is value to be claimed for it? In discussing this
problem, Hilbert drew a distinction between real statements, which
have an intuitive meaning, and ideal statements (involving the
completed infinite) which do not. It is a common device in modern
mathematics to adjoin "ideal elements" to a previously constituted
system in order to achieve theoretical objectives, such as to
simplify the theorems, comprehend them under a more unified
viewpoint, etc. An example is the adjunction of the line at infinity
to the plane in projective geometry, whereby the exception for
parallel lines to the incidence relations between lines and points
is removed. Hilbert argues that just this kind of theoretical
gain is achieved by adjoining the ideal statements to the real
statements in classical mathematics; it is through this procedure
that classical mathematics achieves its power and elegance.

In this way, mathematics becomes a theoretical construction 
in which, Hilbert says, it should not be expected that each sepa- 
rate statement should have a real meaning, any more than that each 
proposition in a system of theoretical physics should be capable 
of immediate experimental verification - in the latter case, it is
the theory as a whole that is tested against reality. 

A concrete example of the theoretical gain obtained by 
going through ideal statements in the process of proving real 
statements is provided by analytic number theory, in which theo- 
rems about integers are proved via the theory of real or complex 
numbers. Many propositions of elementary number theory have been 
proved thus, which either we don't know how to prove or can es- 
tablish only by much more complicated proofs, if only non-analytic 
methods are used. 

Closely related to this defense of classical mathematics as 
a simple and elegant systematizing scheme is the defense provided 
by its success in applications to the theoretical sciences, espe- 
cially physics. This led Weyl 1926 to pronounce Hilbert correct 
when mathematics is merged with physics in the process of
theoretical world construction, while he sided with Brouwer in
restricting himself to intuitive truths when mathematics is pursued for
itself alone.¹⁹

¹⁹See Secs. 9-10, pp. 50-62, of Weyl's 1949 book cited in
footnote *.


14. Formal systems, metamathematics


In the last section, we discussed formal axiomatics, 
stressing the point that the primitive terms are to be treated as 
meaningless for the purpose of deductions from the axioms by 
logic; i.e., either they have been assigned no meanings, or the
meanings they have been assigned have to be left out of account. 
If they had meanings which had to be taken into account, this 
would amount to saying that the theorems depend not only on those 
properties of the primitive terms expressed by the axioms, but 
also on the further properties which entered through the use of 
the meanings. But then those further properties should be stated 
as additional axioms. 

Now in formal axiomatics, while the primitive terms are to 
be meaningless, in carrying out the deductions by logic the mean- 
ings of the ordinary words are used. However we have seen that 
theories may differ in their logic as well as in their mathemati- 
cal assumptions, and to make it perfectly explicit what the theo- 
rems of a theory are to be, one should carry out the step for all 
the words which is carried out in formal axiomatics for the
primitive terms; i.e., we should divest them of meaning for the purpose
of deductions, and carry out the deductions entirely by stated 
rules applying only to the form of the sentences. The logic used 
in the deductions in formal axiomatics must be represented in part 
by these rules but may in part also be provided by logical axioms. 

To carry out such a complete formalization would not be 
practicable if the theory were kept in an ordinary language such 
as English. For the word languages have irregularities and am- 
biguities which would greatly complicate the task. 


Now indeed mathematics has in the entire modern period
profited greatly by use of special symbolism, though it has
customarily left a part of the sentences, including the parts
involved in logical deduction, in ordinary language. The symbolic
equation not only represents a great economy in writing, but
presents an opportunity for manipulations (such as transposing, e.g.,
x + 5 = 2 giving x = 2 - 5 or x = -3), which, though justified
by the meanings, are in practice usually carried out speedily 
without stopping to think through the justification. This is in- 
deed a semi-formal kind of reasoning, which greatly increases the 
power of modern mathematics. 

The complete formalization which we now desire, for Hilbert's
and other purposes, is obtained by combining the symbolization
prevalent in modern mathematics with the symbolic treatment
of logic available from the work of Boole, Frege, Whitehead and 
Russell, to mention only a few. In short, we construct a com- 
pletely symbolic language, using these ingredients, for the theory 
we wish to formalize. The result we call a formal system or 
logistic system or object language. This method of making a the- 
ory explicit is often called the logistic method. 

To discuss it, which includes defining its syntax, specify- 
ing its axioms and rules of inference, and studying the result, 
we operate in another language called the metalanguage or syntax 
language. The study of the formal system, carried out in the 
metalanguage as a part of informal mathematics, we call metamathematics.

For the metalanguage we use ordinary English and operate 
informally, i.e., on the basis of meanings rather than by formal
rules (which would require a metametalanguage for their statement 


and use). Since in the metamathematics English is being applied 


Mathematical Foundations 118 


to the discussion only of the symbols, sequences of symbols, etc., 
of the object language, which constitute a relatively tangible 
subject matter, it should be free in this context from the kind 
of lack of clarity that was one of the reasons for formalizing. 

Since the formal system results by formalizing portions of 
existing informal or semi-formal mathematics, its symbols will 
have an interpretation consisting of that informal or semi-formal 
mathematics. If we were not aware of this, the formal system 
would be devoid of interest. Only, the metamathematics, to accom- 
plish its purposes, must study the formal system as just itself, 
i.e., as simply a system of meaningless symbols, and may not take 
into account its interpretation. When we speak of the interpre- 
tation, we are not doing metamathematics. 

Furthermore, as we saw in Sec. 13, for Hilbert's program 
the methods used in the metamathematics must be ones which we 
feel safe about, called "finitary" methods. (Some writers use
"meta-" for any study of a formal system or object language from
another language, whether by finitary methods or not. We shall
avoid this usage.) 

After supplying a definition of formula, which was presup- 
posed in Chapter II for the propositional calculus and again for 
the predicate calculus, those calculi constitute (that is, when 
we also leave meanings out of account) formal systems. When this 
is done, the theorems given there constitute metamathematical 
theorems, except those involving the validity or valid consequence
notion ⊨ for the predicate calculus, whose definition is
not finitary. (However Corollary Theorem 25, p. 96, is metamathematical,
when proved using the notion of ⊢ which is of
course finitary, although Theorem 25 is not finitary.)


Mathematical Foundations 119 


We now describe a specific formal system N₁ which is designed
to formalize elementary number theory. We first introduce
the formal symbols, which structurally play the part of letters
of the alphabet in our formal language (although for the interpretation
most of them will correspond to words of English). These
symbols are: (, ), ↔, →, &, ∨, ~, ∀, ∃, =, +, ·, ', 0, a,
b, c, ... . The commas, and the dots at the end, are not
formal symbols but punctuation marks used in displaying the
formal symbols on this page. The symbols a, b, c, ... are
called variables; we need however to have a countably infinite
number of variables available potentially. Since the Latin alphabet
has only 26 letters, for definiteness we shall assume that
the variables consist of the 26 Latin letters in script, and also
any of these followed by one or more occurrences of another symbol
|, so that for example a, a|, a||, a|||, b, b|, c||,
etc., are variables. The variables after the first 26 are then
not single formal symbols, but finite sequences of formal symbols.
Then N₁ has an alphabet of exactly 41 formal symbols.

We call a finite sequence of formal symbols a formal ex- 
pression. Just as formal symbols correspond structurally to let- 
ters in ordinary language, formal expressions will correspond 
structurally to words, though for the interpretation some of them 
will represent entire sentences. Most formal expressions, like
((a0 = and aaa, will be without interest to us. But we shall
now define two particular classes of significant formal expressions:
terms, which for the interpretation correspond to nouns,
and formulas, which for the interpretation correspond to sentences.
The definition in each case consists of several clauses.




Mathematical Foundations 120 


Term: 1. 0 is a term. 2. a, b, c, ... are terms.
3. If r and s are terms, so are (r)', (r) + (s), and
(r)·(s). 4. The only terms are those given by 1-3.

Here "r" and "s" are not formal symbols, but metamathematical
variables used in the syntax language to represent formal expressions,
in this case any terms already constructed. Thus
(r) + (s), for instance, is not a formal expression, but a metamathematical
expression which becomes a formal expression when r
and s are replaced by terms.

Examples of terms: 0, a, b, x, (0)', ((0)') + (a),
(((0)') + (a))·(c).
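The four clauses of the definition of term can be read as a recursive test. The following is a minimal sketch only, under assumptions that are not part of the notes: formal expressions are written as Python strings without spaces, variables are single lowercase letters followed by zero or more strokes "|", and "*" stands in for the multiplication dot. The helper names `matching` and `is_term` are hypothetical.

```python
import re

# Assumption: a variable is one lowercase letter followed by strokes "|".
VARIABLE = re.compile(r"[a-z]\|*\Z")


def matching(expr, i):
    """Index of the ")" that closes the "(" at position i, or -1 if none."""
    depth = 0
    for j in range(i, len(expr)):
        if expr[j] == "(":
            depth += 1
        elif expr[j] == ")":
            depth -= 1
            if depth == 0:
                return j
    return -1


def is_term(expr):
    """Test whether expr is a term in the strict (fully parenthesized) sense."""
    # Clauses 1 and 2: 0 and the variables are terms.
    if expr == "0" or VARIABLE.match(expr):
        return True
    # Clause 3: (r)', (r)+(s), (r)*(s) for terms r and s.
    if not expr.startswith("("):
        return False
    j = matching(expr, 0)
    if j < 0 or not is_term(expr[1:j]):
        return False
    rest = expr[j + 1:]
    if rest == "'":
        return True
    if (len(rest) >= 3 and rest[0] in "+*" and rest[1] == "("
            and matching(rest, 1) == len(rest) - 1):
        return is_term(rest[2:-1])
    # Clause 4: nothing else is a term.
    return False
```

Under these assumptions `is_term("(((0)')+(a))*(c)")` holds, while an unparenthesized string such as `"0'"` is rejected, since the strict definition requires the parentheses.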

Formula: 1. If r and s are terms, then (r) = (s) is
a formula. 2. If A and B are formulas, so are (A) ↔ (B),
(A) → (B), (A) & (B), (A) ∨ (B) and ~(A). 3. If A is a
formula and x is a variable, then ∀x(A) and ∃x(A) are formulas.
4. The only formulas are those given by 1-3.

As was the case for "r" and "s" in the definition of term,
"A" and "B" are metamathematical variables representing any formulas,
and "x" a metamathematical variable representing any formal
variable. Thus ∀x(A) becomes a formula when x is replaced by
any variable, e.g., a, and A is replaced by any formula, e.g.,
(a) = (b), giving ∀a((a) = (b)). Using, e.g., b instead we
get a different formula ∀b((a) = (b)). This is why the metamathematical
variable "x" was necessary in Clause 3: had we
written "a" instead we would only be allowing ∀a((a) = (b))
but not ∀b((a) = (b)) as a formula.

If a formal expression is given, how does one determine 
whether or not it is a formula? Let us consider the following 


example: 


Mathematical Foundations 121 
(1)     (∃c((((c)') + (a)) = (b))) → (~((a) = (b)))

We first observe that each of c, a, b (the variable symbols occurring
without parentheses) is a term. We then proceed outwards,
so to speak, using the parentheses as a guide, obtaining successively
that (c)', ((c)') + (a) are terms, and then that
(((c)') + (a)) = (b), ∃c((((c)') + (a)) = (b)),
(a) = (b), ~((a) = (b)), and finally (1) are formulas. In
practice in the case of quite long formal expressions, prior to 
thus testing to see whether or not (1) is a formula, we may pair 
parentheses in the following way: First pair a left parenthesis 
"(" and right parenthesis ")" which occur, the left parenthesis 
to the left of the right one, with no other parenthesis between; 
this pairing may be indicated by attaching subscripts 1. Then
repeat the process using subscripts 2, then 3, etc., each time 
taking into account only parentheses not already subscripted. The 


result of carrying out this process on (1) is shown thus: 


(₁₁∃c(₁₀(₈(₆(₁c)₁')₆ + (₂a)₂)₈ = (₃b)₃)₁₀)₁₁ → (₉~(₇(₄a)₄ = (₅b)₅)₇)₉


Now by following the ordering given by the subscripts, we can 
carry out the verification stage by stage that (1) is a formula. 
It can be proved by induction that if 2n parentheses in linear 
order, n of them left and n of them right, admit a proper 
pairing, i.e., one such that a left parenthesis is always paired 
with a right parenthesis and no two pairs separate each other 

thus (ᵢ (ⱼ )ᵢ )ⱼ, this proper pairing is unique. Also it can be
proved that in a term or formula there is always a proper pairing 
of the parentheses, by which indeed we can always find out in what 
order the parts were put together under the clauses of the defini- 


tions of term and formula. 
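The pairing of parentheses can also be found mechanically. The sketch below is an illustration, not the repeated innermost-pair procedure described above: it uses a single left-to-right scan with a stack, which finds the same pairing because a proper pairing, when one exists, is unique. The function name `pair_parentheses` and the ASCII spelling of the formula in the usage note are assumptions made for the example.

```python
def pair_parentheses(expr):
    """Return {left_index: right_index} for the proper pairing, or None.

    A left parenthesis is pushed on a stack; each right parenthesis is
    paired with the most recent unpaired left parenthesis.
    """
    stack = []
    pairs = {}
    for i, ch in enumerate(expr):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                return None  # a right parenthesis with no partner
            pairs[stack.pop()] = i
    # Left parentheses still on the stack have no partner.
    return pairs if not stack else None
```

For instance, writing formula (1) as the string `"(Ec((((c)')+(a))=(b)))->(~((a)=(b)))"` (with `E` standing in for the existential quantifier), the function returns eleven pairs, one for each pair of parentheses in (1).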




Mathematical Foundations 122 
As abbreviations in our exposition of the metamathematics, 

and without altering the definitions of what a term and formula 
strictly are, we shall omit writing parentheses where we can with- 
out confusion, using the conventions adopted in Chapter II, p. 48, 
and others. The rank of the symbols is as follows: ↔, →,
& and ∨, ~, ∀x and ∃x, =, +, ·, ', ↔ being the strongest
and ' the weakest. Thus (1) can be abbreviated to
∃c(c' + a = b) → ~a = b or even ∃c c' + a = b → ~a = b. We
shall however not always omit all parentheses which these conventions
allow, but shall retain as many as seem desirable for
the greatest clarity.

We also introduce new metamathematical symbols to permit
abbreviations of terms and formulas. For instance, a ≠ b is
an abbreviation for ~a = b and a < b for ∃c(c' + a = b).
Here ≠ and < are symbols used for abbreviation, not formal
symbols. In the case of the abbreviation a < b, there is ambiguity
what variable to use as the "c", i.e., we don't know
whether to unabbreviate by using, e.g., ∃c(c' + a = b) or
∃d(d' + a = b). However which one we use is immaterial for our
purposes. The general rule shall be that r < s abbreviates
∃x(x' + r = s), where x can be any variable not in r or s. Thus
(1) can be written a < b → a ≠ b. Further useful abbreviations
are 1 for 0', 2 for 1', i.e., for ((0)')', 3 for 2',
etc.
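How such abbreviations unfold can be sketched in a few lines. This is an illustration under assumptions: terms are plain strings, variables are single letters, and the names `numeral` and `less_than` (with its list of candidate bound variables) are hypothetical helpers, not anything defined in the notes.

```python
def numeral(n, strict=True):
    """The term abbreviated by the numeral n: 0 followed by n successor steps.

    With strict=True the parentheses required by the definition of term
    are written out, e.g. ((0)')' for 2; otherwise they are omitted.
    """
    term = "0"
    for _ in range(n):
        term = f"({term})'" if strict else term + "'"
    return term


def less_than(r, s, candidates="cdefgh"):
    """Unabbreviate r < s as ∃x(x' + r = s), x a variable not in r or s."""
    for x in candidates:
        if x not in r and x not in s:
            return f"∃{x}({x}' + {r} = {s})"
    raise ValueError("no candidate variable is free for use")
```

Here `numeral(2)` gives `((0)')'`, matching the text, and `less_than("c", "b")` picks `d` as the bound variable because `c` already occurs in one of the terms.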

The list of formal symbols and the definitions of "term" 
and "formula" are the formation rules of our formal system, analo- 
gous to the rules of syntax in ordinary grammar. We must now give 
the definitions that establish the deductive apparatus of the system.
These definitions begin with "postulates," consisting of


Mathematical Foundations 123 


axiom schemata and a list of particular axioms, by which the class
of axioms is defined, and the rules of inference. Then, from the
resulting class of axioms and rules of inference, proof (of A)
and A is provable, deduction (of B) from A₁, ..., Aₙ, and B is deducible
from A₁, ..., Aₙ are defined as before (pp. 65-68, 93); and
⊢ may be used in connection with N₁ as before we used it first
for the propositional calculus and then for the predicate calculus.
To begin with N₁ shall have all the postulates of the
predicate calculus. That is, it shall have the three rules of
inference, the →-rule (p. 65), the ∀-rule and the ∃-rule (p. 93);
and it shall have the Axiom Schemata 1-10b (p. 57), and 41, 42
(p. 84), i.e., all formulas of these forms shall be axioms. Moreover,
we now permit as the "r" for 41, 42 not merely a variable as
described on p. 84 but more generally any term such that when
substituted for the free occurrences of x in A(x) with result
A(r), no variable in any of the resulting occurrences of r will


be bound. For example, taking x to be a, r to be (d)',
and A(x) to be ∃c(c' + a = b) & ~a = 0, the conditions are
satisfied; but not for the same x and r with A(x) now
∃d(d' + a = b) & ~a = 0.

In addition to the postulates of the predicate calculus
(with the more general r for 41, 42), there shall be one axiom
schema (P5) and eight particular axioms (P3, P4, E1, E2, A1, A2,
M1, M2) as follows:


P3.  a' = b' → a = b.
P4.  ~a' = 0.
P5.  (A(0) & ∀x(A(x) → A(x'))) → A(x),

where x is any variable, A(x) any formula, and A(0), A(x') the
results of substituting 0, x' respectively for the free occurrences
of x in A(x).




Mathematical Foundations 124 
E1.  a = b → (a = c → b = c).
E2.  a = b → a' = b'.
A1.  a + 0 = a.
A2.  a + b' = (a + b)'.
M1.  a·0 = 0.
M2.  a·b' = a·b + a.


As we have remarked, N₁ is designed to formalize elementary
number theory. To make this explicit, we shall interpret
the terms as being names for natural numbers (variable or constant),
' as expressing "successor of" or "+1", the other arithmetical
symbols in the familiar way, and the logical symbols
according to the interpretations of Chapter II. Using this interpretation
of N₁, let us consider the meanings of the axioms for
number theory.

Axiom P3, Axiom P4, and Axiom Schema P5 formalize respectively
Postulates (3), (4), (5) of Peano (see pp. 25-26). Postulate
(1) that 0 is a natural number, and Postulate (2) that if
n is a natural number, so is n+1, are taken care of instead
by the formation rules which make 0 a term and whenever r is
a term make (r)' a term, since all the terms in the system are
interpreted as representing natural numbers.

Axioms E1 and E2 are axioms for equality. We did not include
the reflexive law for equality, a = a, because this is
deducible from E1 and A1 by the predicate calculus, so it is provable
in N₁.²⁰ From a = a with E1 the symmetric and transitive
laws can be proved. Axiom E2 assures us that the successor




²⁰ This proof is shown in Kleene, loc. cit.,⁷ p. 84; in fact it
is the only particular proof in N₁ shown in full detail. The
idea of it is easy: substitute a + 0, a, a for a, b, c in
E1, and then use A1 with the →-rule.




Mathematical Foundations 125 


function ' is well-defined. Axioms A1, A2, M1, M2 provide recursive
"definitions" of the operations + (plus or addition) and
· (times or multiplication).
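Read under the interpretation, the recursion equations for + and · compute the two operations from the successor alone. The following sketch transcribes them literally into Python on natural numbers; the function names are assumptions for the illustration, and the recursion on b mirrors the primed second argument in the second equation of each pair.

```python
def succ(a):
    """The successor function, interpreting the formal symbol '."""
    return a + 1


def add(a, b):
    """Addition by recursion on b."""
    if b == 0:
        return a                    # a + 0 = a
    return succ(add(a, b - 1))      # a + b' = (a + b)'


def mul(a, b):
    """Multiplication by recursion on b, using add."""
    if b == 0:
        return 0                    # a·0 = 0
    return add(mul(a, b - 1), a)    # a·b' = a·b + a
```

The value of `add(2, 3)` is thus obtained by three applications of `succ` to 2, and `mul(3, 4)` by four additions of 3 starting from 0.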

The system N₁ is adequate for the usual elementary number
theory such as occurs in standard texts (but not for analytic
number theory). By this we mean that, first, the predicates and
propositions used in informal elementary number theory can be expressed
by formulas in N₁, and, second, for those propositions
which are proved as theorems the formulas expressing them are
provable in N₁.

The first part calls for a little explanation. We have
already seen that although a < b is not provided directly by
the symbolism, a formula ∃c(c' + a = b) is so provided, which
under the interpretation is clearly equivalent to a < b. By
introducing < as a symbol of abbreviation, we can thus express
inequalities. That "a divides b" can be expressed by
∃c(a·c = b) which we abbreviate a|b. That "a is a prime
number" can be expressed by 1 < a & ~∃c(1 < c & c < a & c|a),
which we abbreviate Pr(a). That there are infinitely many
primes can be expressed by ∃b(Pr(b) & a < b) or
∀a∃b(Pr(b) & a < b).
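The way these formulas express the familiar predicates can be checked on small numbers. The sketch below evaluates each existential quantifier by a search; bounding the search is an assumption of the example, justified by the observation that any witness c for c' + a = b or a·c = b is at most b, and any c with c < a is below a. The function names are hypothetical.

```python
def less(a, b):
    """a < b, read as: there is a c with c' + a = b."""
    return any(c + 1 + a == b for c in range(b + 1))


def divides(a, b):
    """a | b, read as: there is a c with a·c = b."""
    return any(a * c == b for c in range(b + 1))


def is_prime(a):
    """Pr(a), read as: 1 < a and no c with 1 < c, c < a and c | a."""
    return less(1, a) and not any(
        less(1, c) and less(c, a) and divides(c, a) for c in range(a)
    )
```

The statement that there are infinitely many primes quantifies over all a, so it cannot be confirmed by such a finite search; only its instances for particular a can be tested this way.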

Since ', +, · are the only function symbols in N₁, besides
which 0 and variables are available, no function other
than a polynomial can be expressed by a term of N₁. This is undoubtedly
a limitation in the symbolism of N₁, but it can be circumvented.
For what can be expressed informally using function
symbols can in fact be paraphrased using predicate symbols instead.
Thus, say f(x₁, ..., xₙ) is a number-theoretic function of
n variables, and let P(x₁, ..., xₙ, y) be the predicate of n+1




Mathematical Foundations 126 


variables which is true for exactly those (n+1)-tuples (x₁, ..., xₙ, y)
such that f(x₁, ..., xₙ) = y. We call P(x₁, ..., xₙ, y) the representing
predicate of f(x₁, ..., xₙ). Then what can be stated using
f(x₁, ..., xₙ) can be paraphrased using P(x₁, ..., xₙ, y)
instead (an illustration will be given below). Now it turns out
that, although only polynomials can be expressed in N₁ by means
of terms, the representing predicates P(x₁, ..., xₙ, y) of a vastly
greater class of functions can be expressed in N₁. Not only can
the propositions using functions be expressed indirectly by using
their representing predicates, but all the reasoning that could
be carried out with the functions can be paralleled too. Thus it
is that N₁ is adequate for the usual elementary number theory
despite its obvious shortage of function symbols.
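The passage from a function to its representing predicate, and the paraphrase of a statement about the function, can be sketched as follows. The example is an assumption chosen for illustration: the exponential function stands for f, the statement paraphrased is "a to the power b, plus 1, equals d", and the names `representing_predicate`, `power`, `E` and the two `statement_...` functions are hypothetical.

```python
def representing_predicate(f):
    """P(x1, ..., xn, y), true exactly when f(x1, ..., xn) = y."""
    def P(*args):
        *xs, y = args
        return f(*xs) == y
    return P


def power(a, b):
    """An example function that is not given by a term built from ', +, ·."""
    return a ** b


E = representing_predicate(power)


def statement_with_function(a, b, d):
    """Stated with the function symbol: power(a, b) + 1 = d."""
    return power(a, b) + 1 == d


def statement_with_predicate(a, b, d):
    """Paraphrased: there is a y with E(a, b, y) and y + 1 = d.

    Any witness y satisfies y < d, so the search may stop below d.
    """
    return any(E(a, b, y) and y + 1 == d for y in range(d))
```

The two statements agree on every choice of arguments, which is the sense in which the predicate version loses nothing that the function version could say.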

One might ask: Would it not be better to remedy this deficiency
by constructing another system with more function symbols?
One can do this, but for foundational questions it is often best
to keep the system as simple in its structure as possible. In
fact there is a basic theorem²² that allows us to use such systems
richer in functions than N₁ and construe all the results in
terms of N₁.

To support our claim that N₁ suffices for the usual elementary
number theory, of course an extended investigation is
necessary, in which we carry out the development of number theory
within N₁.²¹

We shall now set up another system N₂ which adheres
strictly to the symbolism of the predicate calculus, whereas N₁


²¹ This development, beginning with a detailed proof-theoretic
study of propositional and predicate calculus, will be found, 
e.g., in Hilbert and Bernays, loc. cit.,+% vol. I; or Kleene, loc. 
cit.,⁷ Part II.


Mathematical Foundations 127 


had the symbols 0, ', +, ·; i.e., N₂ will have no function symbols
like ', +, ·, and no individual symbols like 0 (which can
be considered symbols for functions of zero variables). The process
will illustrate the same ideas as those by which systems
richer in function symbols than N₁ can be "reduced" to N₁; only
now in the case of the three function symbols ', +, · of N₁ we
can't just dispense with them, but must replace them by predicate
symbols. We begin by proceeding from N₁ to a system N' in which
the function symbol · will be thrown out as a function symbol
but introduced as a predicate symbol. The list of formal symbols
for N' will be the same as for N₁, except that the comma "," will
be added. The definition of term will be altered in Clause 3 by
omitting "(r)·(s)". The definition of formula will have a new
clause that says, "If r, s, and t are terms, then ·(r,s,t) is
a formula." For the interpretation, "·(r,s,t)" is to mean
"r·s = t"; thus ·(a,b,c) expresses the representing predicate
of the function a·b. Finally, the list of postulates for
N' shall come from those for N₁ by omitting M1, M2 and supplying
instead the four following:

M3.  ·(a,0,0).
M4.  ·(a,b,c) → ·(a,b',c + a).
M5.  ∃c(·(a,b,c) & ∀d(·(a,b,d) → c = d)).
E3.  a = b → (·(c,d,a) → ·(c,d,b)).


Here M3 and M4 are M1 and M2 paraphrased to use the predicate symbol
· instead of the function symbol ·. Then M5 expresses that
for given a, b, ·(a,b,c) is true for one and only one c, or
in other words that ·(a,b,c) is the representing predicate
of a function. Finally E3 is one of three formulas that express
that ·(a,b,c) is well-defined as a predicate (the other two
are provable in N' so they are not needed as axioms).




Mathematical Foundations 128 


From N' we can proceed similarly to a system N'' by replacing
+ as function symbol by + as predicate symbol. From N''
we go to N''' by replacing ' as function symbol by ' as predicate
symbol. From N''' we go to N₂ by replacing 0 as individual symbol
by 0 as predicate symbol (we disallow 0 as a term and we
allow 0(r) as a formula for any term r, where 0(r) means "r = 0").

The relationship between N₁ and N₂ can be expressed by a
theorem which says there is a translation scheme by which to each
formula A of N₁ we can find a formula A° of N₂, and to each formula
A of N₂ we can find a formula A* of N₁, so that the following
is true. If A is a formula of both N₁ and N₂, then A, A°, A*
are all the same formula. Letting ⊢₁ and ⊢₂ refer to N₁ and N₂
respectively, we have ⊢₁ A ↔ A°* and ⊢₂ A ↔ A*°; if ⊢₁ A,
then ⊢₂ A°; and if ⊢₂ A, then ⊢₁ A* (and similarly for deducibility).²²

In the rest of this chapter, we shall write simply N
when what we say applies equally well to N₁ and N₂.

A formal system for axiomatic set theory could be set up 
in the symbolism of the predicate calculus using x = y, x ∈ y,
Wx) as the predicate symbols, 16 particular set-theoretic 


axioms and 5 particular equality axioms (cf. pp. 101-102), and 


²² The basic step in the proof of this theorem is the proof by
Hilbert and Bernays of the "eliminability of the notion of 'that 
which,'" or eliminability of descriptions to use Whitehead and 
Russell's term. The proof and application are given by Hilbert 
and Bernays, loc. cit.,+® vol. I, pp. 422-457 and pp. 460 ff. 
There is a treatment in Kleene, loc. cit.,⁷ pp. 407-414 and
pp. 417-419. The theorem gives directly conditions under which
the effect of f(x₁, ..., xₙ) can be secured when we have
P(x₁, ..., xₙ, y) (cf. p. 126 above). That in N₁ we do have
P(x₁, ..., xₙ, y) with appropriate properties for a very wide class
of number-theoretic functions is a consequence of work of Gödel
1931 and Kleene 1936 (cf. Kleene, loc. cit.,⁷ §§48, 49, 58, 74).




Mathematical Foundations 129
the axiom schemata and rules of inference of the predicate calcu- 
lus. 

The propositional calculus, and the predicate calculus,
each constitute a formal system, when a suitable notion of formula
is supplied.²³

The foregoing will serve as examples of formal systems. 
Hilbert's program was undertaken by him, Bernays, Ackermann, von 
Neumann and others after 1920. They began by studying some formal 
systems of number theory, essentially the system N. Ackermann in 
1924 proved the consistency of the subsystem of N in which the 
use of P5 (the induction schema) is restricted to the case of an 
A(x) which does not contain any free occurrence of x in a quantified
part ∀yB or ∃yB. The situation was illuminated in 1931 by
the results of Gödel which we shall present in Sec. 17.


²³ Thus for the propositional calculus, the prime formulas, if not
simply proposition letters, must be so constituted that they have
a proper pairing of parentheses which will then be continued as a
proper pairing in constructing composite formulas (cf. p. 121).
For the predicate calculus, there is the further requirement that,
if what we called "prime predicates" on pp. 72-73 (now we should
prefer to say they are formulas functioning as names for "prime
predicates") are not simply predicate letters or predicate symbols,
they not be of the forms (A) ↔ (B), (A) → (B), (A) & (B),
(A) ∨ (B), ~(A), ∀x(A), ∃x(A); and that in respect to their variables
they meet the condition that the formulas built from them
could arise by proper substitution (pp. 82-84) from formulas constructed
in the version of the calculus using predicate letters.
It is hard to be precise here in completely general terms, and
easy to deal with particular examples like those provided by the
definitions of formula for N₁ or N₂. There are formal systems in
which operations that bind variables occur in the construction of
terms, e.g., in Hilbert, D. and Bernays, P., Grundlagen der Mathematik,
Vol. II, Springer, Berlin, 1939, p. 293, μx(A) is used to
express "the least x such that A, if such an x exists, and 0
otherwise."




Mathematical Foundations 130 


Turing machines, Church's thesis 

Consider a given countably infinite class of mathematical 
or logical questions, each of which calls for a "yes" or "no" 
answer. Is there a method or procedure, which can be described 
in advance, such that, when any question of the class is selected, 
by following the procedure one will be led in a finite number of 
steps to the answer to that question? If such a procedure exists 
it is called a decision procedure or algorithm for the given class 
of questions. The problem of discovering a decision procedure is 
called the decision problem for this class. 

For example, there is a decision procedure for the class of 
questions "Does a divide b?" where a and b are positive
integers. It consists in performing the ordinary division of b 
by a and observing whether or not the remainder is 0. Likewise 
there is an algorithm for determining whether or not a polynomial 
equation with integral coefficients has a rational root. Again 
there is a decision procedure, based on Euclid's algorithm, for 
deciding whether or not, given integers a, b, c, the equation 
ax + by + c = 0 has integer solutions x, y.
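Two of these decision procedures are short enough to write out. The sketch below is an illustration with hypothetical function names. For the linear equation it uses the standard fact, which follows from Euclid's algorithm, that ax + by + c = 0 has integer solutions exactly when the greatest common divisor of a and b divides c; the treatment of the degenerate case a = b = 0 is an addition for completeness, and `math.gcd` from the Python standard library stands in for carrying out Euclid's algorithm by hand.

```python
from math import gcd


def divides(a, b):
    """Decide "Does a divide b?" for positive integers a and b.

    Performs the division of b by a and inspects the remainder.
    """
    return b % a == 0


def has_integer_solution(a, b, c):
    """Decide whether ax + by + c = 0 has integer solutions x, y."""
    g = gcd(a, b)          # greatest common divisor, computed as by Euclid
    if g == 0:             # a = b = 0: the equation reduces to c = 0
        return c == 0
    return c % g == 0      # solvable exactly when gcd(a, b) divides c
```

Each call ends after finitely many steps with "yes" (`True`) or "no" (`False`), whichever integers are supplied.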

For a formal system S, consider the following three general 
questions, i.e., classes of particular questions: "Is a given 
formal expression a formula?", "Is a given sequence of formal ex- 
pressions a proof?", "Is a given formula provable?" As we indi- 
cated in Sec. 14, any given question of the first class can be 
answered by seeking a proper pairing of the parentheses, and if 
one is found attempting with the help of it to retrace the steps 
by which the expression, if it is a formula, must be built up 


under the definitions of term and formula. To answer any given 


Mathematical Foundations 131 
question of the second class, we merely consider each formula in 
the given sequence in order to see if it is an axiom or follows 
from earlier formula(s) by one of the rules of inference. The 
objects which must be examined to answer a question of either of 
these classes are contained as parts of the finite object to which 
the question applies. 

The third class of questions is fundamentally different; 
to show that a formula is provable, one must exhibit a proof of
it, but the proof, if there is one, need not be made up only of
parts of the formula itself. One must therefore look elsewhere 
than within the given object to answer the question. The defini- 
tion of a proof of a given formula sets no bound on the length of 
the proof, and to examine all possible proofs without bound on 
their length is not a procedure which leads to the answer to the 
question in finitely many steps in the case the formula is not 
provable. So this third decision problem, i.e., the decision 
problem for provability in the system S, unlike the first two, is
not trivial. If there is a decision procedure, it is one which is 
not afforded almost immediately by the definition of provable 
formula. Also this decision question for a formal system S is 
of especial interest. So it is often called the decision problem 
for the formal system. 

For provability in the propositional calculus there is the 
decision procedure found by Emil Post in 1921; thus to determine 
whether or not ⊢ A, it suffices, since ⊢ A if and only if ⊨ A, by
Theorems 12 and 13, p. 70, to compute the truth table for A, and
see whether or not it has all t's (p. 51).
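The truth-table test can be sketched directly. As an assumption of the example, a propositional formula is represented by a Python function of its proposition letters rather than by a formal expression, and the names `is_valid` and `implies` are hypothetical; `itertools.product` enumerates the rows of the table.

```python
from itertools import product


def implies(p, q):
    """Truth function of the conditional."""
    return (not p) or q


def is_valid(formula, number_of_letters):
    """Compute the truth table of formula and check that every row is t.

    formula takes number_of_letters truth values and returns a truth value.
    """
    rows = product([True, False], repeat=number_of_letters)
    return all(formula(*row) for row in rows)
```

For a formula with n proposition letters the table has 2ⁿ rows, so the test always terminates; for example `is_valid(lambda p, q: implies(p, implies(q, p)), 2)` holds, while the same test applied to `lambda p, q: implies(p, q)` fails on the row where p is true and q false.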

The recognition of the decision problems for formal systems 


goes back to Schröder 1895, Löwenheim 1915, and Hilbert 1918.


Mathematical Foundations 132 


It would be especially rewarding to have a decision pro- 
cedure for the system N of elementary number theory. For then the 
solution to every problem of the usual elementary number theory 
calling for a "yes" or "no" answer could be obtained automatically.
For instance, the correctness of Fermat's conjecture (p. 3) could
be established or refuted by applying the decision procedure to
determine the provability in N of a formula which expresses "for
all x, y, z, n, 2 < n implies xⁿ + yⁿ ≠ zⁿ." Such a formula in
N₁ is

     ∀x∀y∀z∀n(2 < n → ~A(x,y,z,n))

where A(x,y,z,n) is the result of paraphrasing (cf. pp. 125-6)
xⁿ + yⁿ = zⁿ to avoid the exponent function. Thus to get
A(x,y,z,n) we first find by known methods²² a formula
E(a,b,c) expressing aᵇ = c; then A(x,y,z,n) is

     ∃u∃v∃w(E(x,n,u) & E(y,n,v) & E(z,n,w) & u + v = w).

The efforts expended over centuries in searching for solu- 
tions to this and other famous problems of number theory make it 
implausible that a decision procedure for N should exist. It 
might have seemed equally implausible in 1918 that mathematics 
could find a way to prove that there can be no decision procedure 
for N. But exactly this was done by Church on the basis of a 
thesis of his which we shall propound next. 

We begin by observing that just as we may have a decision 
procedure or algorithm for a countably infinite class of questions 
each calling for a "yes" or "no" answer, we may have a computation 
procedure or algorithm for a countably infinite class of problems 
which require as answer the exhibiting of some object. The prob- 


lem of finding a computation procedure for such a class of ques- 




Mathematical Foundations 133 
tions is the computation problem for the class of questions, or in 
other words for the function whose values are the objects sought 
as the answers to the questions. Any decision problem for a class 
of "yes" or "no" questions, or in other words for a predicate, can 
be handled as the computation problem for the representing function
of the predicate, which takes 0 or 1 as value according as
the value of the predicate is true or false. In other words, a
decision procedure can be handled as a computation procedure by
taking 0 for "yes" and 1 for "no."
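The convention of 0 for "yes" and 1 for "no" can be sketched in a line or two. The name `representing_function` and the divisibility predicate used to try it out are assumptions made for the illustration.

```python
def representing_function(predicate):
    """Turn a decision procedure into a computation procedure.

    The returned function takes the value 0 where the predicate is true
    and the value 1 where it is false.
    """
    def function(*args):
        return 0 if predicate(*args) else 1
    return function


def divides(a, b):
    """An example predicate on positive integers: a divides b."""
    return b % a == 0


divides_value = representing_function(divides)
```

Computing `divides_value(3, 12)` and reading off the value 0 is then the same act as answering "yes" to the question whether 3 divides 12.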

As the classes of questions are countably infinite, we lose
no generality by taking the parameter (or when more convenient,
several parameters) to be natural numbers. So to "normalize" the
notion of a computation procedure, and therewith of a decision
procedure, we can work on the case of computation procedures for
number-theoretic functions. So we ask: What number-theoretic
functions can be computed by preassigned rules? That is, for what
number-theoretic functions is there a procedure describable in advance
such that, if we choose a set of natural numbers as arguments,
i.e., as values of the independent variables, then the procedure
leads us to the corresponding value of the function in a
finite number of steps, each determined by the preassigned rules?
In many special cases, we can agree on the basis of this somewhat
vague and intuitive description that a computation procedure is
known, e.g., for x + 1, x², x + y, √x, [eˣ] (= the greatest
integer less than or equal to eˣ).

But to say what we mean by all possible computation pro- 
cedures, as we must do before we can hope to prove that there is 
no computation procedure for a certain function (or no decision 


procedure for a certain predicate) is another matter. Another way 




Mathematical Foundations 134
to put the question is: What is the class of all computable 
functions? Vaguely we have an idea of what it means for a func- 
tion to be computable; i.e., there is a computation procedure for 
it, which of course need not consist in a direct application of 
the definition of the function, but may be considerably different 
from this. 

Now the historical situation in 1935 was that a certain
exactly defined class of computable number-theoretic functions
considered by Church and Kleene during 1932-35, called the
λ-definable functions, had been found to have properties strongly
suggesting that it might embrace all functions which can be regarded
as computable under our vague intuitive notion. (This
result was somewhat unexpected, since initially it was not clear
that the class contained even the particular computable function
x ∸ 1 = max(x - 1, 0), and a proof that it did was the speaker's
first piece of mathematical research.) Another class of computable
functions, called the general recursive functions, defined by
Gödel in 1934, who built on a suggestion of Herbrand, had similar
properties. It was proved by Church and Kleene that the two
classes are the same, i.e., each λ-definable function is general
recursive and vice versa.

Under these circumstances Church proposed the thesis (published
in 1936), that all functions which intuitively we can regard
as computable, or in his words "effectively calculable," are
λ-definable, or equivalently general recursive. This is a thesis
rather than a theorem, in as much as it proposes to identify a 
somewhat vague intuitive concept with a concept phrased in exact 
mathematical terms, and thus is not susceptible of proof. But 


very strong evidence was adduced by Church, and subsequently by 


Sa 

Mathematical Foundations 135 
others, in support of the thesis, which has come to be generally 
accepted by workers in foundations. 

A little later, but independently, a paper of Turing
(1936-37) appeared in which another exactly defined class of
functions, which we shall call Turing computable, was introduced,
and the same claim was made for this; this claim we call Turing's
thesis. It was shortly shown by Turing that his computable
functions are the same as the λ-definable functions, and hence the
same as the general recursive functions. So Turing's thesis and
Church's thesis are equivalent. We shall usually refer to
the thesis as Church's thesis, or in connection with that one of
its three versions which deals with Turing machines, as the
Church-Turing thesis. Post in 1936 independently of Turing published
rather briefly a formulation fundamentally the same as Turing's.
(In 1943 he published a fourth equivalent, going back to earlier
unpublished work of his. There are still other equivalent
formulations.)

Turing's machine concept arises by a direct effort to
analyze computation procedures as we know them intuitively into
elementary operations, repetitions of which Turing argued would
suffice for any possible computation. For this reason, "Turing
computability" suggests the thesis more immediately than the other
equivalent notions, and so we choose it for our exposition.²⁴

Turing described a kind of theoretical computing machine.
This differs in two respects from a human computer working under
preassigned instructions or an actual digital computing machine
(either a desk calculator, or a more modern device with electronic
tubes, transistors, etc.). First, it is not liable to
errors. Second, it is given a potentially infinite memory. That
is, although the amount of information stored at any one time is
finite, there is no upper bound on this amount. A description of
such a machine follows.

²⁴ In Kleene, loc. cit., the attempt was made to summarize all
the evidence for Church's thesis: §62 gives a general summary,
supplemented by p. 352, and by §70 which elaborates the part of
the evidence which pertains to Turing's machines.

We number moments for the operation of the machine as 0,
1, 2, ... . At any given moment, the machine shall be in one of
k + 1 states, which we number 0, 1, ..., k. State 0 we call the
passive state; the others, active states. A linear tape ruled in
squares passes through the machine (when it is set up for
operation). The tape is potentially infinite to the right. Each
square is either blank or has printed on it one of a given finite
list of symbols $s_1, \ldots, s_j$; however, only a finite number of
squares are printed at any moment in the use of the machine. At
each moment, beginning with moment 0, one square of the tape is
scanned by the machine. Now consider any moment when the machine
is in one of its active states 1, ..., k. Between this moment
and the next, the machine performs an act consisting of a sequence
of three operations (a), (b), (c), each operation coming from the
respective category as follows: (a) print one of the symbols
$s_1, \ldots, s_j$ on the scanned square (supposed blank at the given
moment), or erase the scanned square (supposed printed at the
given moment), or erase and print on the scanned square (supposed
printed at the given moment), or make no change in the scanned
square; (b) move the tape so as at the next moment to scan the
square next to the left of the scanned square (briefly, move
left), or leave the tape unmoved (briefly, stay centered), or
move the tape so as at the next moment to scan the square next to
the right of the scanned square (briefly, move right); (c) change
to another state, or remain in the same state. What act (within
these possibilities) is performed, between a given moment at
which the machine state is active and the next, is determined by
what one of 1, ..., k this machine state is, and what condition
the scanned square is in (i.e., whether blank or printed with
$s_1$, or $s_2$, or ..., or $s_j$) at the given moment. If at a given
moment the machine is in the passive state 0, no act is performed
between this and the next moment, i.e., the machine does not
print or erase, does not move, and does not change from state 0.
We shall give an illustration presently.

First however let us define how such a machine is to be
used to compute a number-theoretic function. (Turing used his
machines primarily to carry out continuing computations of decimal
representations of real numbers.) For this purpose, we must
agree how the argument(s) or value(s) of the independent
variable(s) are to be represented on the tape, and how the machine is
to give us the resulting value of the function. We shall make
the supposition that all machines to be considered have among
their symbols the tally mark "|". We shall represent natural
numbers by a sequence of tallies, | for 0, || for 1, ||| for 2,
etc. To set up the machine and tape to compute for a given
argument x, we shall arrange that: at the moment 0 the system
consisting of machine and tape is started off so that the leftmost
square of the tape is blank; x is represented by tallies on
the next x + 1 squares, and all squares to the right of these
are blank; the machine is scanning the last printed square, and
is in its first active state 1. (In this situation, we say the
machine is applied to x as argument.) We say the machine
computes a value z for x as argument, if, from this configuration
at moment 0, the machine at some later moment eventually
assumes its passive state 0 with a blank and z + 1 tallies
printed on the tape after the x + 1 tallies representing the
argument x, the tape being otherwise blank, and the last printed
square being again the scanned square.

A given machine may compute a value for each natural number 
x as argument, or for some x's but not for others, or even for 
no x's. If, for each x, it computes a value z, and z = f(x), we 
say that the machine computes the function f(x), and that f(x) 
is Turing computable. Similarly for functions of more than one 
variable. 

For example, a machine is applied to 1 as argument, if at
moment 0 it is in the following configuration

[Figure: tape whose squares read blank, |, |, blank, ..., with a
"1" written over the third square]

where the "1" written over the third square shows that this is
the scanned square and the machine is in state 1; and all squares
to the right of those shown are blank. A machine computes the
value 2 for 1 as argument if, having been started at moment 0 in
the configuration above, it eventually at some moment y reaches
the configuration

[Figure: tape whose squares read blank, |, |, blank, |, |, |, with
a "0" written over the last printed square]

where again all squares to the right of those shown are blank. If
in similar fashion, when the machine is started with any x + 1
tallies on the tape, it stops with these followed by a blank and
x + 2 tallies, the machine computes the function f(x) = x + 1;
the above illustrates this for x = 1.


Now we shall describe a machine 𝔄 which does compute this
function f(x) = x + 1 (the successor function) and follow it
through its acts in computing for the case x = 1. The machine
will have only the one symbol "|". To save drawing a picture of
the tape each time, we shall use sequences of 0's and 1's where
the 0's stand for blank squares and the 1's for squares printed
with tallies; the machine state is attached as a superscript
(written here with "^") to the scanned square. Thus in the
illustration above we would write the initial configuration

    0 1 1^1 0 0 0 0

and the configuration at the moment the computation is completed

    0 1 1 0 1 1 1^0


To describe a machine we need merely tell for each of its
active states 1, ..., k and each of the j + 1 scanned-square
conditions (blank or printed with $s_1, \ldots, s_j$) what act it is to
perform. For the machine we describe now there are to be 11
active states, and 2 conditions of the scanned square, namely,
blank and printed with | (or, as we write them more conveniently,
0 and 1). The acts it is to perform for each of the combinations
of these 11 states and 2 scanned-square conditions can be shown
by the following machine table (at the left on p. 140). In the
table P means "print," E means "erase," L, C, R mean "left,"
"center" (i.e., don't move), "right," and the number at the end
of each table entry is the state the machine is to assume at the
succeeding moment.

[p. 140, left]
MACHINE 𝔄, which computes f(x) = x + 1

                 Tape condition
    State          0         1
      1            C0        R2
      2            R3        R9
      3            PL4       R3
      4            L5        L4
      5            L5        L6
      6            R2        R7
      7            R8        ER7
      8            R8        R3
      9            PR9       L10
     10            C0        ER11
     11            PC0       R11

[p. 140, right]
SAMPLE COMPUTATION FOR MACHINE 𝔄 for x = 1
(the superscript "^s" on the scanned square gives the state s)

    Moment    Configuration
       0      0 1 1^1 0 0 0 0
       1      0 1 1 0^2 0 0 0
       2      0 1 1 0 0^3 0 0
       3      0 1 1 0^4 1 0 0
       4      0 1 1^5 0 1 0 0
       5      0 1^6 1 0 1 0 0
       6      0 1 1^7 0 1 0 0
       7      0 1 0 0^7 1 0 0
       8      0 1 0 0 1^8 0 0
       9      0 1 0 0 1 0^3 0
      10      0 1 0 0 1^4 1 0
      11      0 1 0 0^4 1 1 0
      12      0 1 0^5 0 1 1 0
      13      0 1^5 0 0 1 1 0
      14      0^6 1 0 0 1 1 0
      15      0 1^2 0 0 1 1 0
      16      0 1 0^9 0 1 1 0
      17      0 1 1 0^9 1 1 0
      18      0 1 1 1 1^9 1 0
      19      0 1 1 1^10 1 1 0
      20      0 1 1 0 1^11 1 0
      21      0 1 1 0 1 1^11 0
      22      0 1 1 0 1 1 0^11
      23      0 1 1 0 1 1 1^0
      24      0 1 1 0 1 1 1^0
      25      0 1 1 0 1 1 1^0

At the right on p. 140, we follow out the acts of 𝔄 in
computing x + 1 for the argument x = 1. At moment 0, in the
sample computation at the right, we see that a printed square
(i.e., a 1) is scanned in state 1; so we enter the machine table
in the first row and second column, finding "R2", i.e., go right
and assume state 2. The result is shown in the sample computation
opposite moment 1. Now we have a blank square scanned in state 2,
so we enter the table in the second row and first column, finding
"R3", i.e., go right and assume state 3. The result is shown
opposite moment 2. Now the scanned square is still blank but the
state is 3, so by the table (third row, first column, where it
reads "PL4"), the machine prints, goes left, and assumes state
4, with result shown at moment 3. Continuing, we find that at
moment 23 the machine has computed the desired value 2
(= x + 1 for x = 1) shown by three tallies. To see that the
machine computes f(x) = x + 1, we must convince ourselves that it will
compute the correct value for every value of x as argument. We
have done this only for x = 1, but the reader should not find it
too hard to get the "hang" of how this machine operates to see
that it does so for every x.
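
The walk-through above can also be checked mechanically. The following is a minimal sketch, not part of the original notes: the names `TABLE`, `parse_entry`, and `run` are hypothetical, the table is the one written in code form in Sec. 16, and the tape is modelled as a Python list that grows to the right as needed. "P" is taken to print a tally and "E" to erase, which suffices for a one-symbol machine.

```python
# Machine table for the successor machine, as given in code form in Sec. 16.
# Key = active state; first entry = blank square scanned, second = tally scanned.
TABLE = {
    1: ("C0", "R2"),   2: ("R3", "R9"),    3: ("PL4", "R3"),
    4: ("L5", "L4"),   5: ("L5", "L6"),    6: ("R2", "R7"),
    7: ("R8", "ER7"),  8: ("R8", "R3"),    9: ("PR9", "L10"),
    10: ("C0", "ER11"), 11: ("PC0", "R11"),
}

def parse_entry(entry):
    """Split an entry such as 'PL4' into (write, move, next_state)."""
    write = None
    if entry[0] in "PE":
        write, entry = entry[0], entry[1:]
    return write, entry[0], int(entry[1:])

def run(table, x, max_moments=10_000):
    """Apply the machine to x; return (moment, tape, head) when it is passive."""
    tape = [0] + [1] * (x + 1)       # leftmost square blank, then x+1 tallies
    head, state = x + 1, 1           # scan the last tally, in state 1
    for moment in range(max_moments):
        if state == 0:               # passive state: nothing more happens
            return moment, tape, head
        write, move, state = parse_entry(table[state][tape[head]])
        if write == "P":
            tape[head] = 1
        elif write == "E":
            tape[head] = 0
        head += {"L": -1, "C": 0, "R": 1}[move]
        if head < 0:
            raise RuntimeError("machine ran off the left end of the tape")
        if head == len(tape):        # the tape is potentially infinite
            tape.append(0)
    raise RuntimeError("no value computed within max_moments")

moment, tape, head = run(TABLE, 1)
print(moment, tape, head)
```

For x = 1 this reproduces the sample computation: the machine becomes passive at moment 23 with tape 0 1 1 0 1 1 1 and the last tally scanned. Running it for several other small x is a quick way to get the "hang" of the machine.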

To illustrate computation of a function of 2 variables,
consider a machine to compute f(x,y) = x + y, e.g., when started
in the configuration

[Figure: the tape configurations for this example are not
recoverable from the source]

which illustrates the computation for x = 3, y = 1.

The machine 𝔄 that computes x + 1 is sufficiently
complicated so the reader may well wonder how to set up machines to
compute complicated effectively calculable functions. Of course
we are only interested in the theoretical possibility of finding
a machine to compute any given "effectively calculable function"
and not in whether the machine operates efficiently. The business
of finding machines can be systematized by starting from the
theory of recursive functions. This theory deals with recursive
definitions of functions such as

    x + 0 = x,            x · 0 = 0,              x^0 = 1,
    x + y' = (x + y)',    x · y' = x · y + x,     x^(y') = x^y · x.

The functions commonly used in number theory are definable by use
of such recursions, and proceeding from the recursive definitions,
one can in a systematic way find corresponding Turing machines,
after first setting up Turing machines for such simple operations
as filling in with tallies all but the rightmost of a sequence of
blank squares followed by a tally, copying a sequence of tallies,
etc.
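
Recursive definitions of this kind translate directly into programs. The sketch below is an illustration only, assuming the familiar recursion equations for addition, multiplication, and exponentiation, with y' read as the successor y + 1; the function names are hypothetical. Each function uses nothing but the successor and previously defined functions.

```python
def succ(y):
    """The successor function y' = y + 1."""
    return y + 1

def add(x, y):
    # x + 0 = x;   x + y' = (x + y)'
    return x if y == 0 else succ(add(x, y - 1))

def mul(x, y):
    # x * 0 = 0;   x * y' = x * y + x
    return 0 if y == 0 else add(mul(x, y - 1), x)

def power(x, y):
    # x^0 = 1;     x^(y') = x^y * x
    return 1 if y == 0 else mul(power(x, y - 1), x)

print(add(3, 4), mul(3, 4), power(2, 5))
```

Like the machine for x + 1, these definitions are hopelessly inefficient as computation procedures; only the theoretical possibility matters here.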

In giving the arguments to the machines or receiving the
values, we have used only one symbol | besides the blank (i.e.,
two square conditions 0 and 1), and likewise in the action of
machine 𝔄. The definition of Turing machines allows more symbols.
To write a machine table for a machine with j symbols
$s_1, \ldots, s_j$, there will be j + 1 columns. In the case j > 1,
"P" is ambiguous, and we can use "Pi" for "print $s_i$", where "i" can
be in decimal notation.

While we are thus leaving open the possibility of using
j + 1 > 2 tape conditions, instead of only two, the fact is we
get no larger class of computable functions thereby.²⁵

²⁵ This is shown in Kleene, loc. cit., Chapter XIII, by using only
the one symbol | in proving that every general recursive function
is Turing computable, while allowing j symbols $s_1, \ldots, s_j$ in
proving that every Turing computable function is general
recursive.

16. Church's theorem 
We have seen that the pattern of behavior of a given Turing
machine is determined by the table for it; if we know the table,
essentially we know the machine.

The table for a machine can be written in code form. To do
this, we simply write down all the entries, in order, a row at a
time, with commas separating the entries within each row, and
semicolons separating the rows. For example, the table for machine 𝔄,
used for illustration in Sec. 15, becomes in code form:

C0,R2;R3,R9;PL4,R3;L5,L4;L5,L6;R2,R7;R8,ER7;R8,R3;PR9,L10;C0,ER11;PC0,R11

The code for any machine can be written on a typewriter with the
following 17 symbols:

0 1 2 3 4 5 6 7 8 9 P E L C R , ;

By reinterpreting these symbols as the digits of a number in the
number system based on 17 we get a positive integer which
describes the machine table and thence the pattern of behavior of
the machine; call this number the index of the machine. (We have
chosen this particular method of indexing as being easy to
explain. If we were to deal with the subjects below in greater
detail, it would be appropriate to select a method to make the work
as easy as possible. Some other systems are more commonly used
in the literature, but we are not sure the present system would
actually be harder to work with.)
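
The passage from code form to index is ordinary positional notation. Here is a minimal sketch, assuming that the 17 symbols are taken as the digits 0 through 16 in the order 0-9, P, E, L, C, R, comma, semicolon; the names `SYMBOLS`, `index_of`, and `code_of` are hypothetical.

```python
SYMBOLS = "0123456789PELCR,;"     # assumed digit order: '0' is 0, ..., ';' is 16

def index_of(code):
    """Read the code form of a machine table as a numeral in base 17."""
    n = 0
    for ch in code:
        n = 17 * n + SYMBOLS.index(ch)
    return n

def code_of(index):
    """Recover the code form from a positive index (inverse of index_of)."""
    digits = []
    while index > 0:
        index, d = divmod(index, 17)
        digits.append(SYMBOLS[d])
    return "".join(reversed(digits))

CODE_A = ("C0,R2;R3,R9;PL4,R3;L5,L4;L5,L6;R2,R7;R8,ER7;"
          "R8,R3;PR9,L10;C0,ER11;PC0,R11")
i = index_of(CODE_A)
print(i)
print(code_of(i) == CODE_A)
```

Since no code form begins with the symbol 0, nothing is lost in passing to the number and back; distinct tables receive distinct indices.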

Now let T(i,x,y) stand for the following:

    i is the index of a Turing machine (call it
    machine $M_i$) which, when applied to x as
    argument, will at moment y (but not earlier)
    have completed the computation of a value
    (call that value $f_i(x)$).

This predicate (i.e., propositional function) T(i,x,y) is
decidable. For suppose values of i, x, y are given. Then we can
determine whether i, in the 17-system of notation, does describe
a table for a machine. If it doesn't, T(i,x,y) is false. If it
does, then we can follow out the operations performed by that
machine $M_i$ (as in the illustration in Sec. 15), starting it at
moment 0 to compute for x as argument, and continuing to moment
y. Finally, in this case, we can see whether at this moment it has
just completed the computation of a value. If so, T(i,x,y) is
true; if not, false. (For example, if i is the number shown
above in the 17-notation, then T(i,1,23) is true, but T(i,1,y) is
false for every y ≠ 23.)

This should make it clear that T(i,x,y) is decidable intui- 
tively. That a machine exists to decide it, i.e., to compute its 
representing function, is then implied by the Church-Turing thesis. 
(Of course a full treatment of this subject would call for showing 
this without appeal to Church's thesis. One would not do this 
from scratch, but by taking advantage of the theory developed for 
such purposes.)
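
The decision procedure just described can be sketched as a program. This is an illustration under stated assumptions, not the treatment the notes refer to: only one-symbol machines (two columns) are accepted, the base-17 digit order is assumed to be 0-9, P, E, L, C, R, comma, semicolon, "P" sets a square to a tally and "E" to blank, and a machine that moves left off the end of the tape is treated as computing no value. All function names are hypothetical.

```python
import re

SYMBOLS = "0123456789PELCR,;"                 # assumed base-17 digit order
ENTRY = re.compile(r"([PE])?([LCR])(0|[1-9][0-9]*)")

def index_of(code):
    """Read a code string as a base-17 numeral."""
    n = 0
    for ch in code:
        n = 17 * n + SYMBOLS.index(ch)
    return n

def table_of(i):
    """Return the table coded by i, or None if i codes no machine table."""
    code = ""
    while i > 0:
        i, d = divmod(i, 17)
        code = SYMBOLS[d] + code
    table = []
    for row in code.split(";"):
        entries = [ENTRY.fullmatch(e) for e in row.split(",")]
        if len(entries) != 2 or None in entries:   # columns: blank, tally
            return None
        table.append([(m[1], m[2], int(m[3])) for m in entries])
    if any(nxt > len(table) for row in table for (_, _, nxt) in row):
        return None                                # refers to a missing state
    return table

def shows_value(tape, head, x):
    """x+1 tallies, a blank, z+1 tallies, nothing else; last tally scanned."""
    printed = sorted(k for k, v in tape.items() if v)
    extra = len(printed) - (x + 1)
    wanted = list(range(1, x + 2)) + list(range(x + 3, x + 3 + extra))
    return extra >= 1 and printed == wanted and head == printed[-1]

def T(i, x, y):
    """Decide T(i,x,y) by following the machine for at most y moments."""
    table = table_of(i)
    if table is None:
        return False
    tape = {k: 1 for k in range(1, x + 2)}         # square 0 stays blank
    head, state = x + 1, 1
    for moment in range(y + 1):
        if state == 0:                             # passive: computation ended
            return moment == y and shows_value(tape, head, x)
        write, move, state = table[state - 1][tape.get(head, 0)]
        if write is not None:
            tape[head] = 1 if write == "P" else 0
        head += {"L": -1, "C": 0, "R": 1}[move]
        if head < 0:
            return False                           # ran off the left end
    return False

CODE_A = ("C0,R2;R3,R9;PL4,R3;L5,L4;L5,L6;R2,R7;R8,ER7;"
          "R8,R3;PR9,L10;C0,ER11;PC0,R11")
print(T(index_of(CODE_A), 1, 23))
```

The point of the sketch is that every step is bounded: decoding i is finite, and the machine is followed for at most y moments, so the procedure always terminates with an answer.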

The number $f_i(x)$ is not defined for every i and x;
indeed it is defined, for a given i and x, exactly if there
exists a y such that T(i,x,y), or in symbols exactly if
(Ey)T(i,x,y). (In Chapter II we would have written this
∃yT(i,x,y), but here we prefer to alter the logical symbolism, so
that ∃y can be saved for use in formal systems, while (Ey) is used
informally in discussing those systems. For informal mathematics,
we shall also use (y) instead of ∀y, $\overline{A}$ instead of ¬A,
A ≡ B instead of A ↔ B.)

To repeat, $f_i(x)$ is defined exactly if (Ey)T(i,x,y). So
it is a partially defined number-theoretic function, of two
variables i and x, or briefly a partial function. However, for i
and x for which it is defined, we can find its value, thus:
given such i and x, from i we find the table for machine $M_i$,
and then apply (or imitate) $M_i$ by performing its steps, starting
at moment 0 with x as argument, up to the moment y for
which T(i,x,y) is true.

As a digression, we remark that this implies, via the
extension of the Church-Turing thesis to partial functions, that
there is a machine which computes $f_i(x)$ as a partial function of
i, x. Turing showed directly that such a machine exists, though in
a little different situation. In our situation, it is a universal
machine 𝔘 which can be used to compute any computable
function g(x). To use it to compute g(x), say that g(x) is
computed by machine $M_i$; then apply 𝔘 to compute for i, x as a pair
of arguments. Thus now i plays the role of instructions to 𝔘
which tell what function of x to compute.

Theorem 1. The function f(x) defined by

$$f(x) = \begin{cases} f_x(x) + 1 & \text{if } (Ey)T(x,x,y), \\ 0 & \text{otherwise} \end{cases}$$

is not computable.

Proof. Suppose f(x) were computable; say machine $M_q$
computes it, so that $f(x) = f_q(x)$. Substituting q for x,

$$f(q) = f_q(q).$$

But since $M_q$ computes f(x), we have, for all x, (Ey)T(q,x,y),
and in particular (Ey)T(q,q,y). Using this in the definition of
f(x), we have

$$f(q) = f_q(q) + 1.$$


The two displayed equations contradict each other.

To state the proof a little differently, we can consider
each Turing machine $M_q$, and see that it will fail to compute f(x)
correctly for x = q. To begin with, $M_q$ may fail to compute any
value for q as argument. But if $M_q$ does compute a value for
q as argument, that value is $f_q(q)$, by the definition of $f_i(x)$;
and in this case (Ey)T(q,q,y), so that the correct value of f(q)
is $f_q(q) + 1$ by the definition of f(x).

The significance of this result comes from the
Church-Turing thesis, by which computability in Turing's sense agrees
with the intuitive notion of computability. Accepting the thesis,
as workers in foundations do, the director of a computing
laboratory must fail if he undertakes to design a procedure to be
followed, or to build a machine, to compute this function f(x).
This refutes the notion, which news reports on modern developments
in high speed computing tend to foster in the public mind, that
machines can do everything. The theorem does not assert that
there is any particular value of f(x) that we can't learn; but
in whatever model we freeze the design of a computing procedure
or machine, we will be short of having a procedure or machine
that can compute all values of f(x): if the values it computes
are right, there must be some values it can't compute. To improve
the procedure or machine must take ingenuity, i.e., something
that can't be built into the machine.

It should have been apparent that there must be
uncomputable number-theoretic functions, as soon as we met the
Church-Turing thesis in Sec. 15, by which the different possible machines
are countable in number, because each is describable by a finite
table in a fixed symbolism. The machines being countable, the
functions computable by machines are countable, while the set of
all the number-theoretic functions is uncountable (Sec. 3). But
it remains of interest to see how simple examples of uncomputable
functions we can give. Our example (Theorem 1) is really quite
simple, as it is obtained by completing the definition (by the
"= 0 otherwise") of a suitable computable partial function.

It is instructive to go over to the corresponding question
for predicates.

Theorem 2. The predicate (Ey)T(x,x,y) is undecidable,
i.e., the function

$$g(x) = \begin{cases} 0 & \text{if } (Ey)T(x,x,y), \\ 1 & \text{otherwise} \end{cases}$$

is uncomputable.

Proof. If we could decide (Ey)T(x,x,y), we could compute
the f(x) of Theorem 1 thus: given x, decide whether
(Ey)T(x,x,y) or not; then if this decision gives "yes," imitate
the behavior of $M_x$ for x as argument to compute $f_x(x)$, and add
1 to the result; if "no," write 0.

This is essentially Church's theorem, which appeared with
his thesis in his 1936 paper entitled "An unsolvable problem of
elementary number theory." The difference is that we have given
an example in terms of Turing computability, whereas Church's
example was in terms of "λ-definability." The problem that is
"unsolvable" is to find a decision procedure for the predicate
(Ey)T(x,x,y). Of course the problem is solved in another sense,
by its being shown that there cannot be the required decision
procedure. The problem of trisecting the general angle by a ruler
and compass construction is unsolvable in one sense, but in
another is solved by its being shown that the required construction
cannot exist.

We conclude this section by applying Church's theorem (as
he did) to the decision problems for two important formal systems.
Theorem 3. If in the formal system N of number theory
only true formulas are provable, then there is no decision
procedure for provability of a formula in N; i.e., the decision
problem for N is unsolvable.

More generally, there is no decision procedure for
provability in any formal system S in which, to each x, there can be
found a formula $B_x$ which is provable in S when and only when
(Ey)T(x,x,y).

In the first part of the theorem, "true" of course means
under the usual interpretation of the symbolism of N.

Proof. Each of the propositions (Ey)T(x,x,y) for
x = 0, 1, 2, ... is expressible in the symbolism of N, under the
usual interpretation, by a formula $B_x$. Of course, to show this in
detail would require an extended study of decidable predicates and
computable functions, which is beyond the scope of these
lectures.²⁶ Suffice it to say that if there weren't such formulas
$B_x$, the symbolism of N would be inadequate for elementary number
theory.

²⁶ Such a study is found in Kleene, loc. cit., Part III.
Specifically, one can show T(x,x,y) primitive recursive (Chapter IX)
using the technique of §69, and then apply Corollary Theorem I,
p. 242. The predicate written $(Ey)T_1(x,x,y)$ in Chapter XI,
pp. 281 ff. plays a role for general recursiveness analogous to
that of the (Ey)T(x,x,y) of these notes for Turing computability.
(In some papers $T_1$ is written simply T; the notation goes
back to a 1936 paper of Kleene which slightly antedates Turing's
paper.)

If, for a given x, (Ey)T(x,x,y) is true, then we can prove
it informally, by the mechanical process of exhibiting the
computation steps of $M_x$ applied to x up to the moment y at which
a value has just been computed. Now this informal proof can also
be carried out within N (i.e., formalized), so that we can say:

(a) If (Ey)T(x,x,y), then $\vdash B_x$ in N.

Of course to show this would require a detailed investigation of
the proof theory of N, which we can't give here.²⁷ But if it
weren't the case, the deductive apparatus of N (i.e., the list
of its axiom schemata, axioms, and rules of inference) would be
inadequate for elementary number theory.

The hypothesis that in N only true formulas are provable
gives us the following, since under the interpretation $B_x$
expresses (Ey)T(x,x,y):

(b) If $\vdash B_x$ in N, then (Ey)T(x,x,y).

Now suppose there were a decision procedure for N, i.e.,
a procedure by which, given any formula A of N, we can decide
whether ⊢ A in N or not ⊢ A in N. Then we could, given any x,
decide whether $\vdash B_x$ in N (in which case by (b), (Ey)T(x,x,y)) or
not (in which case by (a), $\overline{(Ey)}T(x,x,y)$, i.e., not (Ey)T(x,x,y)).
Thus we could decide, given x, whether or not (Ey)T(x,x,y), which
is contrary to Theorem 2.

Theorem 4. (Church 1936, Turing 1936.) There is no
decision procedure for provability (and hence, by Chapter II,
Theorem 25, and Corollary 1 of Theorem 26, pp. 96-97, for
validity) in the predicate calculus.

Proof. Take N to be Ne (Sec. 14). Then the $B_x$ of the
proof of Theorem 3 is a formula in the symbolism of the predicate
calculus as we studied it in Chapter II.

²⁷ See Kleene, loc. cit., Part II with §§49, 59 of Part III.
Specifically, if the method of the preceding footnote was
followed to get $B_x$, Corollary Theorem 27, p. 244, may be used now.

Detailed consideration would show that the informal proof
of (Ey)T(x,x,y), as given in the proof of Theorem 3 for each x
for which (Ey)T(x,x,y) is true, can be formalized as a deduction
from a suitable finite list $A_1, \ldots, A_n$ of closed formulas
independent of x. Thus:²⁸

(c) If (Ey)T(x,x,y), then $A_1, \ldots, A_n \vdash B_x$ in the predicate
calculus.

Under the usual interpretation of the predicate letters of
Ne with a suitable interpretation of the additional predicate
letters which occur in $A_1, \ldots, A_n$, all of $A_1, \ldots, A_n$ will be
true, and $B_x$ will be true only if (Ey)T(x,x,y). Hence
$A_1, \ldots, A_n \models B_x$ only if (Ey)T(x,x,y). Applying Chapter II
Corollary Theorem 23, Theorem 25, and Corollary Theorem 21 to
replace ⊨ here by ⊢:

(d) If $A_1, \ldots, A_n \vdash B_x$ in the predicate calculus, then
(Ey)T(x,x,y).

A strictly "metamathematical" proof of (d) (not using ⊨) can also
be given, though it is long.²⁸

From (c) and (d) by Chapter II Corollary Theorem 25 and
Theorem 24, respectively:

(e) If (Ey)T(x,x,y), then $\vdash A_1 \to (A_2 \to \cdots (A_n \to B_x) \cdots)$
in the predicate calculus.

(f) If $\vdash A_1 \to (A_2 \to \cdots (A_n \to B_x) \cdots)$ in the predicate
calculus, then (Ey)T(x,x,y).

So if there were a decision procedure for provability in
the predicate calculus, by applying it, given any x, to decide
whether or not $A_1 \to (A_2 \to \cdots (A_n \to B_x) \cdots)$ is provable, we
could by (f) and (e) decide whether or not (Ey)T(x,x,y), contrary
to Theorem 2.

²⁸ The details appear e.g. in Kleene, loc. cit., Part IV, with
certain results of Parts II and III, cf. p. 434, Remark 2.

These undecidability results arose first (like Theorem 2)
directly in connection with the new notions of λ-definability,
general recursiveness, and Turing computability, and next (like
Theorems 3 and 4) for decision problems for formal systems.
Church, shortly after obtaining his results, wrote the speaker
that he hoped they would lead to undecidability results for some
existing problems of mathematics that have arisen from outside the
field of logic and foundations. This hope was fulfilled,
beginning in 1948 when Post and A. A. Markov (the younger)
independently of each other showed the "word problem for semi-groups" to be
unsolvable. A succession of further results in this direction
has been obtained, one by Turing and a number by Russians,
culminating in the announcement by Novikov in 1952 that the "word
problem for groups" is unsolvable (a lengthy Russian article
claiming to give the proof very recently appeared). A somewhat
different direction has been taken by the very active school
under Tarski at Berkeley, California, and including Mostowski at
Warsaw, who show the undecidability of a variety of algebraic
theories as formalized using the predicate calculus.²⁹

²⁹ Tarski, A., Mostowski, A., and Robinson, R. M., Undecidable
theories, North Holland Publishing Co., Amsterdam, 1953.

17. Gödel's theorem

In this section we shall present the two famous
incompleteness theorems of Gödel (1931). Gödel gave them for "Principia
Mathematica and related systems." We shall begin by giving the
first of these theorems ("Gödel's theorem") in a generalized form
(Theorems 5a, 5b) applicable to all formal systems in which
certain number-theoretic propositions can be expressed. Such
generalized Gödel theorems, which are reached from the standpoint
provided by Church's thesis published in 1936, were given by
Kleene in 1936 and 1943.³⁰

A formal system like N we call complete, if for each
closed formula B, either ⊢ B or ⊢ ¬B in the system. The
restriction to closed formulas is made because, e.g. in N, we would not
want either of ∃y(x = 2y) (which expresses "every number x is
even") and ¬∃y(x = 2y) (which expresses "every number x is not
even," i.e., "every number x is odd") to be provable; however,
one of ∀x∃y(x = 2y) and ¬∀x∃y(x = 2y) should be provable. (Which
one?) The restriction to systems "like N," besides providing that
¬ is a symbol, is to exclude such systems as the propositional and
predicate calculi with prime formulas or predicate letters which
behave for the interpretation like free variables; e.g., in the
propositional calculus, when B is a formula whose truth table
has neither all t's nor all f's, neither B nor ¬B is provable.
The following theorem asserts the incompleteness of any system S
satisfying its hypotheses.

Theorem 5a. Let S be any formal system, in which ¬
expresses "not," and in which to each x we can find a closed
formula $B_x$ expressing (Ey)T(x,x,y). Suppose that in S, for
each x, each of $B_x$ and $\neg B_x$ is provable only if true. Then it is
absurd that for all x, either $B_x$ or $\neg B_x$ is provable in S.

³⁰ An excellent exposition of Gödel's theorem along the lines of
Gödel's 1931 paper is in Nagel, Ernest and Newman, James R.,
Gödel's proof, Scientific American, June 1956, pp. 71 ff. (We
have one minor criticism: a misleading impression is created on
p. 86 that undecidability results like Theorem 2 are implied by
Gödel's theorem in the 1931 version, and thus without the
Church-Turing thesis.)

Proof. From the nature of a formal system S, it follows
that all the proofs in S can be written down systematically, one
after another, in order, i.e., they can be effectively enumerated.
Now suppose that, for each x, either $\vdash B_x$ or $\vdash \neg B_x$. Then
given any x, if we search through the proofs in S in order, we
will eventually come to a proof of $B_x$ or of $\neg B_x$. If this proof is
of $B_x$, then, since $B_x$ is only provable if true and $B_x$ expresses
(Ey)T(x,x,y), we shall know that (Ey)T(x,x,y). Similarly, if it
is of $\neg B_x$, then we shall know that $\overline{(Ey)}T(x,x,y)$. Thus, under our
supposition, we would have a decision procedure for whether or not
(Ey)T(x,x,y), contrary to Theorem 2.
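
The search used in this proof can be sketched as follows. This is only an illustration: proofs are modelled as finite strings over a finite alphabet, and the three tests `is_proof`, `proves_b`, and `proves_not_b` are hypothetical stand-ins for the effective checks that a real formal system S would supply. The toy tests at the end are arbitrary and carry no logical meaning.

```python
from itertools import count, islice, product

def all_strings(alphabet):
    """Enumerate every finite string over the alphabet, shortest first."""
    for length in count(0):
        for letters in product(alphabet, repeat=length):
            yield "".join(letters)

def enumerate_proofs(alphabet, is_proof):
    """Keep just those strings that pass the effective proof check."""
    return (s for s in all_strings(alphabet) if is_proof(s))

def decide(alphabet, is_proof, proves_b, proves_not_b):
    """Search the proofs in order; stops only if one of the two is provable."""
    for proof in enumerate_proofs(alphabet, is_proof):
        if proves_b(proof):
            return True
        if proves_not_b(proof):
            return False

print(list(islice(all_strings("ab"), 7)))

# Arbitrary toy checks, for demonstration only.
answer = decide("ab",
                is_proof=lambda s: len(s) > 0,
                proves_b=lambda s: s.endswith("bb"),
                proves_not_b=lambda s: s.endswith("aa"))
print(answer)
```

The search terminates for a given x only under the supposition that one of the two formulas is provable; that supposition is what the theorem shows cannot hold for every x.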

This is Gödel's theorem in a negative form. By a more
careful analysis we next obtain it in a positive form, according
to which we can find an x (which we call q) such that neither
$\vdash B_q$ nor $\vdash \neg B_q$. (In intuitive mathematics, (Ex)P(x) is stronger
than $\overline{(x)}\,\overline{P}(x)$.)

Theorem 5b. Let S be any formal system, in which ¬
expresses "not," and in which to each x we can find a closed
formula $B_x$ expressing (Ey)T(x,x,y). Suppose that in S, for
each x, each of $B_x$ and $\neg B_x$ is provable only if true. Then there
is a number q such that (a) $\neg B_q$ is true, but (b) $\neg B_q$ is
unprovable in S, and (c) $B_q$ is unprovable in S.

Proof. Consider the aouputetion procedure which consists, 
given x, in searching in order through the proofs in S for a 
proof of Bao and if such a proof is found, then writing 0. A 
Turing machine, say machine M., carries out this procedure, given 


qa 


any xX as argument. Let My be given q as argument. 


Mathematical Foundations 154 


Now we have: 


(1) (3, is true) = (Ey)T(qa,q.y) 
= (My applied to q computes a value) = (- ~By)- 


The first equivalence is because, for each x, BL expresses 

(Ey )T(x,x,y). The next is by the definition of T(i,x,y) in 

Sec. 16. The last is by the choice of My as a machine which car- 
ries out the described computation procedure. Taking negations 
an Ci)... 

(44) (By is false) = (~B, is true) = (Ey)T(a,q,y) = (not + ~Bg)- 


Now suppose that B_q is true. By (i), |- ~B_q. Thence, by
the hypothesis that ~B_q is provable only if true, ~B_q is true, and
hence by (ii), B_q is false, contradicting our supposition. So by
reductio ad absurdum, B_q is false. Thence, by (ii), (a) ~B_q is
true, but (b) not |- ~B_q. Also from B_q is false, by the
hypothesis that B_q is provable only if true, (c) not |- B_q.

So the theorem is proved. Note that, by (ii), ~B_q expresses
its own unprovability. Thus ~B_q arises from the sentence in the
paradox of the Liar (Sec. 5, p. 41) by substituting "unprovable"
for "untrue." Gödel's 1931 proof was motivated thus. Cantor's
diagonal method enters into the construction of the proposition
(Ey)T(q,q,y) (we go down the "diagonal" in T(i,x,y) when we equate
i and x to q).

The first part of Hilbert's program called for formalizing 
number theory, analysis, and a suitable part of set theory in a
formal system S. Gödel's theorem shows this can't be done
completely even for number theory. For ~B_q expresses a
number-theoretic proposition which by the theorem is true, yet
unprovable, provided of course S satisfies the hypotheses of the
theorem.




Let us consider these hypotheses. For each x, (Ey)T(x,x,y)
is a meaningful proposition of elementary number theory, so if we
can't find a formula B_x in S that expresses it, the symbolism
of S is inadequate for number theory. We should of course be
able to express also its negation (Ēy)T(x,x,y); and it is merely
a matter of convenience that we specified that ~B_x should do this.
The next part of the hypotheses, that B_x and ~B_x not be provable
unless true, demands that the deductive apparatus of S accord
with our interpretation of B_x and ~B_x as expressing
(Ey)T(x,x,y) and (Ēy)T(x,x,y), respectively. There is also one
other hypothesis that entered tacitly, because we took it to be 
guaranteed by our conception of a formal system. This is that 
proofs in S can be recognized effectively, so that they can be 
listed effectively. If S did not do this, it would not serve 
the purpose for which we and Hilbert want formal systems. 

This, with the understanding that we can find B_x for each
x, and thus, given x and a proof, recognize that proof as being
or not being a proof of B_x, places us in a position to apply the
Church-Turing thesis to conclude that there is the machine M_q as
described. For a person who subscribes to the thesis, the proof
of the theorem then excludes all hope that a formal system can be
set up which is both correct and complete for the theory of the
elementary number-theoretic predicate (Ey)T(x,x,y). This predi- 
cate has been used in the statement and proof of the theorem only 
in a finitary way. 

A person who does not subscribe to the thesis may have 
hope, but it will be futile so long as we can construct the M_q
(which avoids appealing to the thesis) for each S he proposes. We
now indicate (Theorem 5c) that this can be done for the case S is
N, i.e., either N_1 or N_2 (Sec. 14). At the same time we shall
state the theorem in strictly metamathematical terms, avoiding
mention of the interpretation, as Gödel did in 1931.

We call a system S with the symbolism of N (simply) consistent
if, for no formula B, are both B and ~B provable in S.

We call a system S with the symbolism of N_1 ω-consistent
if, for no formula A(y), are ~∀yA(y) and all of A(0), A(1),
A(2), ... provable in S; here A(n) is the result of substituting
for the free occurrences of y in A(y) the numeral n, i.e.,
the term 0'...' with n accents. Via the translation from N_1
to N_2, the notion extends to N_2. If S is ω-consistent, it is
also (simply) consistent; for if |- B and |- ~B, then using Axiom
Schema 3 with 1 and 4 every formula is provable in S, in
particular, ~∀yA(y), A(0), A(1), A(2), ...


Theorem 5c. There is a closed formula B_q in N such
that: (b) If N is consistent, ~B_q is unprovable in N. (c) If N
is ω-consistent, B_q is unprovable in N.

Proof. We state the proof for N_1. It can be shown that
in N_1 there is a formula A(x,y), containing free only x and y,
such that

(iii) if T(x,x,y), then |- A(x,y),
(iv) if T̄(x,x,y), then |- ~A(x,y)

(and which expresses T(x,x,y) under the usual interpretation of
the symbolism). In fact, this formula is found first in getting
the B_x for the proof of Theorem 3, as specified in footnotes 26, 27.
Let B_x be ∃yA(x,y), where x is the numeral 0'...' with x
accents.

For this choice of B_x and the particular system N_1, it is
a lengthy but routine task to find the index q of a machine M_q
to carry out the computation procedure described in the proof of
Theorem 5b.³¹

To establish (b), assume that N_1 is consistent. Preparatory
to reductio ad absurdum, suppose that |- ~B_q. Then by (i),
(Ey)T(q,q,y), i.e., there is a y such that T(q,q,y). Applying
(iii) for q and this y as the "x" and "y", |- A(q,y), whence
using Axiom Schema 41 and the →-rule of N_1, |- ∃yA(q,y), i.e.,
|- B_q. This with |- ~B_q contradicts the hypothesis that N_1 is
consistent. Hence by reductio ad absurdum, not |- ~B_q.

For (c), assume N_1 ω-consistent, hence (simply) consistent,
so by (b), not |- ~B_q. Thence by (ii), (Ēy)T(q,q,y), which is
equivalent to (y)T̄(q,q,y), whence by (iv), ~A(q,0), ~A(q,1),
~A(q,2), ... are all provable. Now suppose (preparatory to
reductio ad absurdum) that |- B_q, i.e., |- ∃yA(q,y), whence using
48 of Theorem 18, Chapter II with |- for |=, |- ~∀y~A(q,y). These
results contradict the ω-consistency, with ~A(q,y) as the
"A(y)".

From (b) by (ii), we could add in the statement of the 
theorem "(a) If N is consistent, ~B_q is true", but we left it
out to show how the theorem looks when we avoid reference to the 
interpretation. Theorem 5c shows the inadequacy of N to decide, 
by either proof or disproof, all the statements expressed in it 
by closed formulas; in particular, it won't do so for B_q, which
is hence said to be formally undecidable in N. This accounts for
the title of Gödel's 1931 paper, which (translated from the German)
is "On formally undecidable propositions of Principia


³¹Kleene, loc. cit., Theorem 31, p. 258, with Theorem XXVIII,
p. 363. The latter is only needed because we are basing the 
treatment on Turing machines. 


Mathematica and related systems I." 

If N is simply consistent, then N with B_q added as a
new axiom is a system which is simply consistent without being
ω-consistent. Rosser in 1936 found that by using a different
formula B_r in place of B_q, simple consistency would suffice in
the (c) part of the theorem. To get this refinement here, we
would substitute for M_q a machine M_r which, given x, searches
through the proofs of N_1 for one of ~B_x and writes 0 if such
a proof is found before a proof of B_x is found; a person having
some familiarity with the development of number theory in N_1
could easily make the necessary modifications in the proof of
Theorem 5c. Also, by this method, the hypothesis of the first
part of Theorem 3 can be simplified to "If N is consistent."³²

Preparatory to Gödel's second theorem (Theorem 6), let M_s
carry out the procedure, given any x, of searching through all
pairs of proofs in N for two with the property that the last
formula of the second is the negation of the last formula of the
first, and writing 0 if such a pair is found. Thus M_s computes
the constant function h(x) = 0 if N is inconsistent, and
computes no value for any x if N is consistent. Thus:

(v) (Ey)T(s,s,y) ≡ (N is inconsistent).
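
The procedure of M_s can be sketched as follows, under the assumption (ours, for illustration only) that the proofs of N are presented as a finite list of their last formulas, with the negation of a formula F encoded as the pair ("not", F). The argument x plays no role, in keeping with h(x) being constant.

```python
def is_negation_of(candidate, formula):
    """True if `candidate` is the negation ("not", formula) of `formula`."""
    return isinstance(candidate, tuple) and candidate == ("not", formula)

def inconsistency_search(last_formulas):
    """Write 0 if two of the listed proofs end in F and ("not", F)."""
    seen = []
    for formula in last_formulas:
        for earlier in seen:
            if is_negation_of(formula, earlier) or is_negation_of(earlier, formula):
                return 0
        seen.append(formula)
    return None   # no value computed on this (finite) list of proofs
```

On an actual consistent N the search would simply run forever; the finite list is what makes the sketch runnable.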
Using the B_x of the proof of Theorem 5c, we now have:


~B_s is a formula of N which, under the usual interpretation
of N, expresses in N (via (v)) that N is simply consistent.

There are other ways of constructing such a formula, a bit
simpler perhaps, but this makes use of only the apparatus (the


³²Compare, e.g., Kleene, loc. cit., pp. 208, 314.




predicate (Ey)T(x,x,y) and formulas B_x) already brought into our
discussion. 
Theorem 6. If N is consistent, then the formula ~B_s
which expresses this fact is not provable in N.


Proof. According to Theorem 5c (b),

(vi) N is consistent implies not |- ~B_q.

Now we know that "N is consistent" is expressible in N by
the formula ~B_s, and by (ii) "not |- ~B_q" is expressible by ~B_q.
Thus the metamathematical statement (vi) can be expressed in N by
the formula ~B_s → ~B_q.

If the axioms and rules of inference of N are adequate 
for elementary number-theoretic reasoning, such as we used in a
metamathematical application to prove Theorem 5c (b), it should be
possible to imitate the metamathematical proof of (vi) by a proof
in N of the formula ~B_s → ~B_q. We shall assume that this is
the fact, i.e., that in N,³³

(vii) |- ~B_s → ~B_q.


To complete the proof of Theorem 6, we assume now that N 
is consistent. Suppose (preparatory to reductio ad absurdum) that 
|- ~B_s in N. Thence by (vii) and the →-rule, |- ~B_q,
contradicting Theorem 5c (b).

As a postscript to this proof, we emphasize that metamathematics
can be construed as a branch of elementary number theory.


³³The aforesaid "person having some familiarity with the development
of number theory in N_1" would find it hard to doubt this.
However, Hilbert and Bernays, loc. cit., vol. II, pp. 283 ff.,
especially pp. 306-324, did carry out in detail a proof of this, 
applying to a system essentially equivalent to N, though using a 
somewhat different system of numbering from the one we have intro- 
duced here in connection with Turing machine tables. 


For consider a given formal system S as an object for the
metamathematics. Say there are n formal symbols in the alphabet of
S (cf. Sec. 14), and let us add the semicolon ";", supposed not in
S, to use in separating formulas of S written in succession. Then
by the method of digits, with an (n+1)-ary number system, all the
metamathematical objects (including formal symbols, terms and
formulas, proofs) can be reinterpreted as numbers.³⁴ So not only is
(vi) expressed in elementary number theory when we rewrite it
"(Ēy)T(s,s,y) → (Ēy)T(q,q,y)," which in formal number theory
becomes the formula of (vii), but also all the metamathematical
vocabulary used in connection with it, specifically in the proof of
Theorem 5c (b), can be expressed in number theory. So expressed,
it is ready for the translation into N, i.e., the formalization,
which gives us (vii).
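
The method of digits can be tried out on a small scale. The sketch below is an illustration under stated assumptions, not the numbering of the notes: the six-symbol alphabet is hypothetical, and to avoid the ambiguity of leading zeros we let the symbols stand for the digits 1, ..., n+1 (a "bijective" (n+1)-ary notation), a choice the text does not specify.

```python
# Hypothetical alphabet: n = 5 formal symbols plus the separator ";".
ALPHABET = ["0", "'", "~", "(", ")", ";"]

def to_number(text, alphabet=ALPHABET):
    """Read a string of symbols as one number, symbols acting as digits 1..base."""
    base = len(alphabet)
    number = 0
    for symbol in text:
        number = number * base + alphabet.index(symbol) + 1
    return number

def to_text(number, alphabet=ALPHABET):
    """Recover the string of symbols from its number."""
    base = len(alphabet)
    symbols = []
    while number > 0:
        number, digit = divmod(number, base)
        if digit == 0:              # digit `base` shows up as remainder 0
            digit, number = base, number - 1
        symbols.append(alphabet[digit - 1])
    return "".join(reversed(symbols))
```

For instance, `to_number("0'")` is 1*6 + 2 = 8, and `to_text(8)` gives back "0'"; a sequence of formulas separated by ";" is numbered in the same way.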

The second part of Hilbert's program for foundations called 
for showing by finitary metamathematical reasoning that a formal 
system chosen as a formalization of classical mathematics is con- 
sistent. As the mathematics formalized in N is not all finitary, 
the hope would seem to have been that the part of the methods 
formalized in N obtained by excluding the non-finitary ones would 
suffice for the consistency proof. Gödel's second theorem shows
that not even all the methods formalized in N, i.e., a metalan- 
guage isomorphic with N itself, can prove the consistency of N, if 
N is consistent. 

³⁴The somewhat different method of numbering used by Gödel in
1931 is described in Nagel and Newman, loc. cit. Not this
exactly, but a modification of it, is used by Hilbert and Bernays,
loc. cit., Vol. II and in Kleene, loc. cit.

Some mathematicians judged that this ended forever any hope
of getting a guarantee for classical mathematics by a
metamathematical consistency proof.

Others thought it possible that methods could be found 
which could be considered as finitary even though not formalizable 
in N. Progress in proving the consistency of N had been stalled 
since the consistency proof for a subsystem of N by Ackermann in 
1924; some interesting new proofs, but still for essentially the 
same subsystem, were given by von Neumann in 1927, Herbrand in 
1930, and Gentzen in 1934. After the necessity of using some
non-elementary method was revealed by Gödel's second theorem in 1931,
it was not too long before in 1936 Gentzen gave a consistency 
proof for N. This employed, as its method not formalizable in N, 
induction over a certain segment of the transfinite ordinal num- 
bers, which Cantor had obtained by an extension of the counting 
process or ordinal use of numbers beyond the natural numbers, in 
association with "well-ordered" sets (analogous to his introduc- 
tion of transfinite cardinal numbers in association with unordered 
sets, Sec. 4). The induction was over the ordinals less than the 
ordinal called ε_0 by Cantor.³⁵ One logician, asked whether he
felt more secure about classical mathematics from Gentzen's con- 
sistency proof, replied, "Yes, by an epsilon." A somewhat differ- 
ent consistency proof for N was given by Ackermann in 1940, also 
using transfinite induction over the ordinals < ε_0.

³⁵For a little more of an indication, and references to the
original papers, see Kleene, loc. cit., pp. 476-478 (dates there
refer to the bibliography, pp. 517 ff.).

It is clear that these consistency proofs work very hard to
accomplish something, but less clear what. Recently (1951)
Kreisel proposed by use of Ackermann's proof to give a "finitist"
interpretation of classical number theory.³⁶ Indeed, Gentzen had
already claimed in the first (1936) version of his consistency 
proof to give a property of proved formulas of classical formal 
number theory N that can be regarded as an intuitive interpreta- 
tion, but the property was complicated and received little
attention after his proof appeared in another version (1938) easier to
follow and not adducing the property. 

To see that there is a problem of interpretation, we recall 
from Sec. 13, pp. 114-115, Hilbert's distinction (1926, 1928) be- 
tween "real" propositions having a clear intuitive meaning, and 
other propositions called "ideal." In classical mathematics, the 
"ideal" propositions are adjoined to the "real." One might have 
supposed that the "real" propositions would include all the propo- 
sitions of elementary number theory. 

However the picture is not this simple. For already in 
elementary number theory there are propositions proved classically 
that cannot be true on the basis of their meanings for an intui- 
tionist. This was emphasized in 1943 by Kleene who argued as 
follows. 

³⁶The easiest introduction to Kreisel's ideas is probably Kreisel,
G., A variant to Hilbert's theory of the foundations of
mathematics, British Journal for the Philosophy of Science, Vol. 4
(1953), pp. 107-129.

The intuitionist understands an existential proposition
(Ey)P(y) to mean that one can actually find a y such that P(y).
From this standpoint what can (x)(Ey)P(x,y) mean? Only that
there is an effective procedure by which, given any x, one can
actually find a y such that P(x,y). By the Church-Turing thesis,
this must mean that the y is a computable function of x. Thus
we are led to the thesis that intuitionistically (x)(Ey)P(x,y)
only if there is a computable function g(x) such that
(x)P(x,g(x)). (This is the third of three "theses" listed by
Kleene in 1943, the first being Church's, and the second a thesis
concerning formal systems which follows from the first, and
entails the generalized Gödel theorem, as we argued on p. 155.)
By the classical law of the excluded middle (which
intuitionists decline to affirm), for each x,

    (Ez)T(x,x,z) v (Ēz)T(x,x,z). Thence
    [(Ez)T(x,x,z) & 0 = 0] v [(Ēz)T(x,x,z) & 1 = 1]. Thence
    (Ey)[(Ez)T(x,x,z) & y = 0] v (Ey)[(Ēz)T(x,x,z) & y = 1]. Thence
    (Ey){[(Ez)T(x,x,z) & y = 0] v [(Ēz)T(x,x,z) & y = 1]}.

This is for each x, so we have proved classically

(1) (x)(Ey){[(Ez)T(x,x,z) & y = 0] v [(Ēz)T(x,x,z) & y = 1]}.

We have presented this proof informally, but it is a simple matter
to formalize it in N_1, so that we have

(2) |- ∀x∃y{[∃zA(x,z) & y = 0] v [~∃zA(x,z) & y = 1]} in N_1.

Let us abbreviate (1) as (x)(Ey)P(x,y). By the above 
thesis, (1) only holds intuitionistically if (x)P(x,g(x)) for 
some computable function g(x). But from what P(x,y) is in this 
example, the only g(x) for which (x)P(x,g(x)) is the representing
function of (Ez)T(x,x,z), and by Theorem 2 of Sec. 16 this
g(x) is not computable. 

Summarizing: (1) holds in classical informal number theory,
and translates into a formula which as (2) states is provable in
N_1, but (1) cannot be affirmed as true intuitionistically. Other
examples, taken from analysis, of classical theorems that on the
basis of the thesis are unprovable intuitionistically were given
by Specker in 1949.

Kreisel's proposal uses Ackermann's consistency proof for
N_1 to correlate to (1) on the basis of (2) a considerably more
complicated proposition, which is both meaningful and true for a
"finitist." A "finitist," if not an intuitionist, is at least of
a similar ilk. We make the distinction, because we are not sure
whether the intuitionists, for whom Brouwer and Heyting can speak
with authority, accept Kreisel's "finitist" interpretation
unreservedly.³⁷

Formal systems can be set up to study the foundations of
intuitionistic (as well as of classical) mathematics, as Heyting
did in 1930, though the intuitionists have always maintained on
philosophical grounds (from before Gödel's theorem was known) that
such systems can't be complete. Kleene, David Nelson and others
have since 1944 been applying computable (or general recursive)
functions to elucidate the differences between intuitionistic and
classical formal systems.³⁸


18. Gödel's theorems and Skolem models


The idea of this section is simple: Formal number theory
equals number-theoretic axioms plus predicate calculus. By one
theorem of Gödel, formal number theory is incomplete. By another,
the predicate calculus is complete. Therefore the
number-theoretic axioms are incomplete.


³⁷To learn about intuitionism, one may start with Heyting, A.,
Les fondements des mathématiques, Intuitionnisme, Théorie de la
Démonstration, Gauthier-Villars, Paris and E. Nauwelaerts,
Louvain, 1955, and Heyting, A., Intuitionism, An Introduction,
Studies in Logic, North Holland Publishing Co., Amsterdam, 1956.

³⁸Cf. Kleene, loc. cit., § 82. More recent results are not yet
published.


This idea will give us the theorem of Skolem (1933, 1934) 
on the impossibility of characterizing the natural number series 
by a system of axioms in the symbolism of the predicate calculus. 

The natural numbers can be considered as a mathematical
system (ℕ, 0, ') where ℕ is a set, 0 a member of ℕ, and ' a
function from ℕ to ℕ; or as a system (ℕ, x = 0, x' = y) where
ℕ is a set, x = 0 a one-place predicate over ℕ, and x' = y a
two-place predicate over ℕ. Fundamentally it is immaterial which
we use; so we use the second to match with the symbolism of the 
predicate calculus. 

The theorem will apply to a finite or countably infinite
set of formulas A_0, A_1, A_2, ... of the predicate calculus which
we would like to take as a system of axioms for the natural
number system (ℕ, x = 0, x' = y). Without essential loss of
generality we can take A_0, A_1, A_2, ... to be closed. Let the
predicate letters which we intend should express x = y, x = 0, x' = y
be E(x,y), Z(x), S(x,y). By an interpretation let us mean a
choice of a domain D, and an assignment of predicates (or logical
functions) over D to the predicate letters of A_0, A_1, A_2, ... (as
in Secs. 10-12 on the predicate calculus). Let us denote an
interpretation I_i by writing (D_i, E_i(x,y), Z_i(x), S_i(x,y), ...)
where D_i is the domain and E_i(x,y), Z_i(x), S_i(x,y) are the
predicates assigned to E(x,y), Z(x), S(x,y).

Theorem 7. Let A_0, A_1, A_2, ... be a given list of closed
formulas in the predicate calculus containing (among others) the
predicate letters E(x,y), Z(x), S(x,y). Suppose given an
interpretation I_0 = (ℕ, x = y, x = 0, x' = y, ...) under which
A_0, A_1, A_2, ... are all true. Then there is another
interpretation I_1 = (D_1, x = y, Z_1(x), S_1(x,y), ...) under which
A_0, A_1, A_2, ... are all true such that (D_1, Z_1(x), S_1(x,y)) is
not isomorphic to (ℕ, x = 0, x' = y).
Proof. Let A'_0, A'_1, A'_2, ... be the closures of the axioms
of N_2, with the notation rearranged so that x = y, 0(x), '(x,y),
+(x,y,z), ·(x,y,z) become E(x,y), Z(x), S(x,y), A(x,y,z),
M(x,y,z) where A(x,y,z), M(x,y,z) are predicate letters which in
I_0 are interpreted by x+y = z, x·y = z if such are available,
and otherwise two new predicate letters, in which case we shall
understand I_0 to be extended by assigning x+y = z, x·y = z to
them.

Now let a formal system S have as axioms A_0, A'_0, A_1, A'_1,
A_2, A'_2, ... and the axioms of the predicate calculus, and as
rules of inference the three rules of the predicate calculus. Then
S satisfies the hypotheses of Theorem 5b for I_0 as the
interpretation. For S includes N_2 (i.e., with the notation rearranged),
and in N_2 (by the proof of Theorem 5c) to each x there is a
closed formula B_x expressing (Ey)T(x,x,y) under the
interpretation I_0 (since I_0 includes the usual interpretation of N_2).
Moreover B_x and ~B_x are provable in S only when true under I_0, since
A_0, A'_0, A_1, A'_1, A_2, A'_2, ... are all true under I_0, and by
Chapter II Corollary Theorem 23, Theorem 25 and Corollary Theorem 21
every formula provable in S (i.e., deducible by predicate calculus
from A_0, A'_0, A_1, A'_1, A_2, A'_2, ...) is true under each
interpretation which makes all of A_0, A'_0, A_1, A'_1, A_2, A'_2, ... true.

So by Theorem 5b, ~B_q is true under I_0, but unprovable
in S (i.e., not deducible by predicate calculus from
A_0, A'_0, A_1, A'_1, A_2, A'_2, ...).

But by Chapter II Corollary 1 Theorem 27, if ~B_q were
true under every interpretation which makes all of
A_0, A'_0, A_1, A'_1, A_2, A'_2, ... true, ~B_q would be provable in S.
Hence there is another interpretation I' which makes
A_0, A'_0, A_1, A'_1, A_2, A'_2, ... all true, but makes ~B_q false.

We do not know that in I' the predicate E'(x,y) assigned
to E(x,y) is x = y. However, by the device used at the
corresponding point in the derivation of Skolem's paradox (bottom
p. 101), the domain D' can be altered, if necessary, to give an
interpretation I_1 = (D_1, x = y, Z_1(x), S_1(x,y), ...) which still
makes all of A_0, A'_0, A_1, A'_1, A_2, A'_2, ... true but ~B_q false.
This device depends on the presence of the equality axioms of N_2,
which are such that by forming D' into "equivalence classes," by
putting into a class with each x ∈ D' all those y ∈ D' such
that E'(x,y) is true, the truth or falsity of each of the other
predicates in I' will depend only on the equivalence classes to
which its arguments belong.³⁹
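
The device of passing to equivalence classes can be seen in miniature on a finite domain. The toy interpretation below (six elements, with E'(x,y) taken as agreement modulo 3) is hypothetical and is ours, chosen only so that the other predicate respects E' as the equality axioms would require.

```python
def equivalence_classes(domain, related):
    """Partition `domain` into classes of elements related by `related`."""
    classes = []
    for x in domain:
        for cls in classes:
            if related(x, cls[0]):     # compare with the class representative
                cls.append(x)
                break
        else:
            classes.append([x])
    return classes

def lift(predicate):
    """Evaluate a one-place predicate on a class through any representative."""
    return lambda cls: predicate(cls[0])

# Hypothetical toy interpretation I': D' = {0, ..., 5}.
D_prime = range(6)
E_prime = lambda x, y: x % 3 == y % 3
Z_prime = lambda x: x % 3 == 0          # depends only on the class of x

D_1 = equivalence_classes(D_prime, E_prime)
Z_1 = lift(Z_prime)
```

Over the new domain D_1 the predicate assigned to E(x,y) is genuine identity of classes, while Z_1 gives on each class the value Z' had on every one of its members.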


The formula ~B_q contains only the predicate letters
E(x,y), Z(x), S(x,y), A(x,y,z), M(x,y,z), of which E(x,y) is
interpreted in I_1 by x = y.

Suppose for reductio ad absurdum that (D_1, Z_1(x), S_1(x,y))
is isomorphic to (ℕ, x = 0, x' = y). This means that by replacing
the members of D_1 by their indices in a certain enumeration,
(D_1, Z_1(x), S_1(x,y)) becomes (ℕ, x = 0, x' = y), and to simplify
terminology we shall suppose we have done so.


Now in I_1, E(x,y), Z(x), S(x,y) are interpreted by x = y,
x = 0, x' = y. The axioms of N_2 include the equivalents under the
translation process from N_1 described at the end of Sec. 14,
pp. 126-8, of the recursion equations

    x + 0 = x,            x·0 = 0,
    x + y' = (x + y)',    x·y' = x·y + x

for + and · as number-theoretic functions. These recursion
equations define those functions, when the range of the variables
is ℕ and =, ', +, · have their usual meanings. The truth of
those axioms of N_2 in I_1 now forces A(x,y,z), M(x,y,z) to be
interpreted by x+y = z, x·y = z, so that all the predicate
letters of ~B_q have the same interpretation in I_1 as in I_0,
contradicting that ~B_q is true in I_0 but false in I_1.
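
That the recursion equations determine + and · once 0 and ' are fixed can be checked mechanically. The sketch below transcribes the four equations directly, with Python's `x + 1` standing in for the successor x'; the function names are ours.

```python
def succ(x):
    """The successor x'."""
    return x + 1

def add(x, y):
    """x + 0 = x;  x + y' = (x + y)'."""
    return x if y == 0 else succ(add(x, y - 1))

def mul(x, y):
    """x·0 = 0;  x·y' = x·y + x."""
    return 0 if y == 0 else add(mul(x, y - 1), x)
```

Each value is reached by unwinding the second argument down to 0, so no interpretation of A(x,y,z), M(x,y,z) over the natural numbers other than x+y = z, x·y = z can satisfy the equations.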

Skolem proved the theorem by a direct method, which is
shorter starting from scratch, but longer than this starting from
the two Gödel theorems (Theorem 5b and Chapter II Corollary 1
Theorem 27).

Skolem's proof describes the interpretation I_1, and it may
seem that we only have the absurdity that it shouldn't exist.
However from the unprovability of ~B_q in S the proof of Theorem 27
actually provides the I' and thence the I_1, the predicates in I_1
being defined by quite simple expressions of classical elementary
number theory.³⁹

One final comment on our proof: We tacitly assumed that,
as the phrase "given" in the hypothesis suggests, the formulas
A_0, A_1, A_2, ... are given effectively as are the axioms of a
formal system, so that A_0, A'_0, A_1, A'_1, A_2, A'_2, ... can be taken,
with the axioms of the predicate calculus, as the axioms of S.
The theorem seems to us of interest for foundations primarily when
this is the case. However Skolem's proof didn't make this
assumption and the present proof can be freed of it by using throughout


a device of "relativization," which we haven't had the time to go
into.⁴⁰


³⁹Kleene, loc. cit., Theorems 38, p. 398, and 49, p. 401.
The considerations used in our proof of Theorem 7 show the
formal undecidability of B_q to be a phenomenon of the same
character as the undecidability of Euclid's "Parallel Postulate" from
the other postulates of Euclidean geometry (pp. 108-109).

The axioms of formal number theory were intended to cover 
Peano's postulates. However the full force of the induction 
postulate is not obtained, because the axiom schema for it gives 
us an axiom only for each of the countably many formulas A(x),
while the informal postulate, pp. 25-26, is for the uncountably 
many number-theoretic predicates P(x). 

Translating from Skolem 1934, "...the [natural number]
series is completely characterized, for example, by the Peano
axioms, if one regards the notion 'set' or 'propositional function'
as something given in advance with an absolute meaning independent
of all principles of generation or axioms. But if one would make
the axiomatics true to principle (Ger. konsequent), so that also
the reasoning with the sets or propositional functions is
axiomatized, then, as we have seen, the unique or complete
characterization of the number series is impossible."⁴¹


⁴⁰Relativized versions will be found in Kleene, loc. cit., of
all the results we have used, e.g., cf. pp. 224, 275, 292, top
298, 362, 431. The relativization can be to that function of i
which gives A_i as a number in the (n+1)-ary notation (cf. p. 160).


⁴¹The reference to the original may be found in the bibliography
of Kleene, loc. cit. (also cf. pp. 429-432). The interpretation
I_1 can be called a "Skolem model." Ryll-Nardzewski's theorem
mentioned on p. 103 was established by using a Skolem model.
Theorem 7 and Skolem's paradox, p. 102, have given rise to "non- 
standard models" as a field of investigation; a "non-standard 
model" for a system of axioms is an interpretation at variance 
with the intended one. E.g., cf. Rosser, J. Barkley and Wang, Hao, 
Non-Standard models for formal logics, Journal of Symbolic Logic, 
Vol. 15 (1950), pp. 113-129. 




